Analysis-by-synthesis speech coding method with open-loop and closed-loop search of a long-term prediction delay

ABSTRACT

A method comprising the steps of performing a linear prediction analysis of a speech signal digitized in a series of frames divided into sub-frames, in order to determine the parameters of a short-term synthesis filter; carrying out an open-loop analysis to detect voiced signal frames and determine, for each voiced frame, a degree of signal voicing and a long-term prediction delay search interval containing a number of delays depending on the degree of voicing; carrying out a closed-loop predictive analysis of the speech signal to select, for at least some sub-frames of the voiced frames, a long-term prediction delay contained in the search interval and constituting a long-term synthesis filter parameter; and determining a stochastic excitation for each sub-frame, to minimize a perceptually weighted deviation between the speech signal and the stochastic excitation filtered by the long-term and short-term synthesis filters.

BACKGROUND OF THE INVENTION

The present invention relates to analysis-by-synthesis speech coding.

The applicant company has particularly described such speech coders, which it has developed, in its European patent applications 0 195 487, 0 347 307 and 0 469 997.

In an analysis-by-synthesis speech coder, linear prediction of the speech signal is performed in order to obtain the coefficients of a short-term synthesis filter modelling the transfer function of the vocal tract. These coefficients are passed to the decoder, as well as parameters characterising an excitation to be applied to the short-term synthesis filter. In the majority of present-day coders, the longer-term correlations of the speech signal are also sought in order to characterise a long-term synthesis filter taking account of the pitch of the speech. When the signal is voiced, the excitation in fact includes a predictable component which can be represented by the past excitation, delayed by TP samples of the speech signal and subjected to a gain g_(p). The long-term synthesis filter, also reconstituted at the decoder, then has a transfer function of the form 1/B(z) with B(z)=1-g_(p).z^(-TP). The remaining, unpredictable part of the excitation is called stochastic excitation. In the coders known as CELP ("Code Excited Linear Prediction") coders, the stochastic excitation consists of a vector looked up in a predetermined dictionary. In the coders known as MPLPC ("Multi-Pulse Linear Prediction Coding") coders, the stochastic excitation includes a certain number of pulses the positions of which are sought by the coder. In general, CELP coders are preferred for low data transmission rates, but they are more complex to implement than MPLPC coders.

In order to determine the long-term prediction delay, a closed-loop analysis, an open-loop analysis or a combination of the two is used. The open-loop analysis is not demanding in terms of amount of calculation, but its accuracy is limited. Conversely, the closed-loop analysis requires much calculation, but it is more reliable as it contributes directly to minimising the perceptually weighted difference between the speech signal and the synthetic signal. In certain cases, an open-loop analysis is carried out first of all in order to limit the interval within which the closed-loop analyser will search for the prediction delay. This search interval must nevertheless remain relatively wide, since account has to be taken of the fact that the delay may vary rapidly.

The invention aims particularly to find a good compromise between the quality of the modelling of the long-term part of the excitation and the complexity of the search for the corresponding delay in a speech coder.

SUMMARY OF THE INVENTION

The invention thus proposes an analysis-by-synthesis speech coding method for coding a speech signal digitised into successive frames which are divided into sub-frames, comprising the following steps: linear prediction analysis of the speech signal in order to determine parameters of a short-term synthesis filter; open-loop analysis of the speech signal in order to detect the voiced frames of the signal and in order, for each voiced frame, to determine a degree of voicing of the signal and an interval for searching for a long-term prediction delay; closed-loop predictive analysis of the speech signal in order, for at least some of the sub-frames of the voiced frames, to select a long-term prediction delay contained in the search interval and constituting a parameter of a long-term synthesis filter; and determination of a stochastic excitation for each sub-frame, so as to minimise a perceptually weighted difference between the speech signal and the stochastic excitation filtered by the long-term and short-term synthesis filters. In the open-loop analysis step, the search interval relating to each voiced frame is determined so that it contains a number of delays which is dependent on the degree of voicing of said frame.

Hence, the number of delays which are to be tested in closed-loop mode can be matched to the mode of voicing of the frame. In general, the width of the search interval will be less for the most voiced frames so as to take account of their higher harmonic stability. For these very voiced frames, one or more bits can be saved on the differential quantification of the delay in the search interval, and this bit or these bits saved can be reallocated to perceptually important parameters, such as the long-term prediction gain, which improves the quality of reproduction of the speech.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a radio communications station incorporating a speech coder implementing the invention;

FIG. 2 is a block diagram of a radio communications station able to receive a signal produced by the station of FIG. 1;

FIGS. 3 to 6 are flow charts illustrating a process of open-loop LTP analysis applied in the speech coder of FIG. 1;

FIG. 7 is a flow chart illustrating a process for determining the impulse response of the weighted synthesis filter applied in the speech coder of FIG. 1;

FIGS. 8 to 11 are flow charts illustrating a process of searching for the stochastic excitation applied in the speech coder of FIG. 1.

DESCRIPTION OF PREFERRED EMBODIMENTS

A speech coder implementing the invention is applicable in various types of speech transmission and/or storage systems relying on a digital compression technique. In the example of FIG. 1, the speech coder 16 forms part of a mobile radio communications station. The speech signal S is a digital signal sampled at a frequency typically equal to 8 kHz. The signal S is output by an analogue-digital converter 18 receiving the amplified and filtered output signal from a microphone 20. The converter 18 puts the speech signal S into the form of successive frames which are themselves subdivided into nst sub-frames of lst samples. A 20 ms frame typically includes nst=4 sub-frames of lst=40 samples of 16 bits at 8 kHz. Upstream of the coder 16, the speech signal S may also be subjected to conventional shaping processes such as Hamming filtering. The speech coder 16 delivers a binary sequence with a data rate substantially lower than that of the speech signal S, and applies this sequence to a channel coder 22, the function of which is to introduce redundancy bits into the signal so as to permit detection and/or correction of any transmission errors. The output signal from the channel coder 22 is then modulated onto a carrier frequency by the modulator 24, and the modulated signal is transmitted on the air interface.

The speech coder 16 is an analysis-by-synthesis coder. The coder 16 determines, on the one hand, parameters characterising a short-term synthesis filter modelling the speaker's vocal tract, and, on the other hand, an excitation sequence which, applied to the short-term synthesis filter, supplies a synthetic signal constituting an estimate of the speech signal S according to a perceptual weighting criterion.

The short-term synthesis filter has a transfer function of the form 1/A(z), with: ##EQU1##

The coefficients a_(i) are determined by a module 26 for short-term linear prediction analysis of the speech signal S. The a_(i) are the coefficients of linear prediction of the speech signal S. The order q of the linear prediction is typically of the order of 10. The methods which can be applied by the module 26 for the short-term linear prediction are well known in the field of speech coding. The module 26, for example, implements the Durbin-Levinson algorithm (see J. Makhoul: "Linear Prediction: A tutorial review", Proc. IEEE, Vol. 63, no. 4, April 1975, pp. 561-580). The coefficients a_(i) obtained are supplied to a module 28 which converts them into line spectrum parameters (LSP). The representation of the prediction coefficients a_(i) by LSP parameters is frequently used in analysis-by-synthesis speech coders. The LSP parameters are the q numbers cos(2πf_(i)) arranged in decreasing order, the q normalised line spectrum frequencies (LSF) f_(i) (1≤i≤q) being such that the complex numbers exp(2πjf_(i)), with i=1, 3, . . . , q-1, q+1 and f_(q+1)=0.5, are the roots of the polynomial Q(z) defined by Q(z)=A(z)+z^(-(q+1)).A(z^(-1)) and that the complex numbers exp(2πjf_(i)), with i=0, 2, 4, . . . , q and f_(0)=0, are the roots of the polynomial Q*(z) defined by Q*(z)=A(z)-z^(-(q+1)).A(z^(-1)).

The LSP parameters may be obtained by the conversion module 28 by the conventional method of Chebyshev polynomials (see P. Kabal and R. P. Ramachandran: "The computation of line spectral frequencies using Chebyshev polynomials", IEEE Trans. ASSP, Vol. 34, no. 6, 1986, pages 1419-1426). It is the quantified values of the LSP parameters, obtained by a quantification module 30, which are forwarded to the decoder for it to recover the coefficients a_(i) of the short-term synthesis filter. The coefficients a_(i) may be recovered simply, given that:

$$A(z)=\frac{Q(z)+Q^{*}(z)}{2}$$
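The relation between A(z), Q(z) and Q*(z) can be checked numerically. The following is a minimal sketch, not the conversion performed by the modules 28 and 58: it only builds the coefficient lists of Q(z) and Q*(z) from their definitions and recovers A(z) from the half-sum, which follows directly from those definitions. The function names and the representation of A(z) as a list of coefficients of z^(-i) are assumptions made for illustration.

```python
def lsp_polynomials(c):
    """Return the coefficient lists of Q(z) and Q*(z), in powers of z^-1.

    c = [c0, c1, ..., cq] holds the coefficients of A(z) = sum c_i z^-i
    (c0 = 1), whatever sign convention was used to obtain them.
    """
    a_pad = list(c) + [0.0]              # A(z), padded to degree q+1
    a_rev = [0.0] + list(reversed(c))    # z^-(q+1) * A(z^-1)
    q_sym = [x + y for x, y in zip(a_pad, a_rev)]    # Q(z)
    q_anti = [x - y for x, y in zip(a_pad, a_rev)]   # Q*(z)
    return q_sym, q_anti


def recover_a(q_sym, q_anti):
    """Recover A(z) as (Q(z) + Q*(z)) / 2, dropping the null z^-(q+1) term."""
    return [(x + y) / 2.0 for x, y in zip(q_sym, q_anti)][:-1]
```

Q(z) comes out with symmetric coefficients and Q*(z) with antisymmetric ones, which is why z=-1 and z=1 are respectively among their roots (f_(q+1)=0.5 and f_(0)=0 in the text).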

In order to avoid abrupt variations in the transfer function of the short-term synthesis filter, the LSP parameters are subject to interpolation before the prediction coefficients a_(i) are deduced from them. This interpolation is performed on the first sub-frames of each frame of the signal. For example, if LSP_(t) and LSP_(t-1) respectively designate an LSP parameter calculated for frame t and for the preceding frame t-1, then LSP_(t)(0)=0.5LSP_(t-1)+0.5LSP_(t), LSP_(t)(1)=0.25LSP_(t-1)+0.75LSP_(t) and LSP_(t)(2)= . . . =LSP_(t)(nst-1)=LSP_(t) for the sub-frames 0, 1, 2, . . . , nst-1 of frame t. The coefficients a_(i) of the 1/A(z) filter are then determined, sub-frame by sub-frame, on the basis of the interpolated LSP parameters.
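The interpolation rule given as an example above can be written out as follows. This is a minimal sketch using the weights quoted in the text (0.5 and 0.25 on the previous frame for sub-frames 0 and 1, none afterwards); the function name and the list-of-lists return format are assumptions made for illustration.

```python
def interpolate_lsp(lsp_prev, lsp_cur, nst=4):
    """Return one interpolated LSP vector per sub-frame of frame t."""
    # Weight applied to the parameters of frame t-1; zero from sub-frame 2 on.
    w_prev = [0.5, 0.25] + [0.0] * (nst - 2)
    return [
        [w * p + (1.0 - w) * c for p, c in zip(lsp_prev, lsp_cur)]
        for w in w_prev[:nst]
    ]
```

For instance, `interpolate_lsp([0.0, 1.0], [1.0, 0.0])` yields `[0.5, 0.5]` for sub-frame 0, `[0.75, 0.25]` for sub-frame 1, and the current-frame vector for sub-frames 2 and 3.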

The unquantified LSP parameters are supplied by the module 28 to a module 32 for calculating the coefficients of a perceptual weighting filter 34. The perceptual weighting filter 34 preferably has a transfer function of the form W(z)=A(z/γ₁)/A(z/γ₂) where γ₁ and γ₂ are coefficients such that γ₁>γ₂>0 (for example, γ₁=0.9 and γ₂=0.6). The coefficients of the perceptual weighting filter are calculated by the module 32 for each sub-frame after interpolation of the LSP parameters received from the module 28.
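Replacing z by z/γ in A(z) multiplies the coefficient of z^(-i) by γ^i, so the numerator and denominator of W(z) are obtained by simple scaling. The sketch below illustrates this under the assumption that A(z) is available as a list of coefficients of z^(-i); the function names are hypothetical and the sketch does not reproduce the LSP-based calculation performed by the module 32.

```python
def bandwidth_expand(c, gamma):
    """Coefficients of A(z/gamma): the z^-i term is scaled by gamma**i."""
    return [ci * gamma ** i for i, ci in enumerate(c)]


def weighting_filter(c, gamma1=0.9, gamma2=0.6):
    """Return (numerator, denominator) of W(z) = A(z/gamma1) / A(z/gamma2)."""
    return bandwidth_expand(c, gamma1), bandwidth_expand(c, gamma2)
```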

The perceptual weighting filter 34 receives the speech signal S and delivers a perceptually weighted signal SW which is analysed by modules 36, 38, 40 in order to determine the excitation sequence. The excitation sequence of the short-term filter consists of an excitation which can be predicted by a long-term synthesis filter modelling the pitch of the speech, and of an unpredictable stochastic excitation, or innovation sequence.

The module 36 performs a long-term prediction (LTP) in open loop, that is to say that it does not contribute directly to minimising the weighted error. In the case represented, the weighting filter 34 intervenes upstream of the open-loop analysis module, but it could be otherwise: the module 36 could act directly on the speech signal S, or even on the signal S with its short-term correlations removed by a filter with transfer function A(z). On the other hand, the modules 38 and 40 operate in closed loop, that is to say that they contribute directly to minimising the perceptually weighted error.

The long-term synthesis filter has a transfer function of the form 1/B(z), with B(z)=1-g_(p).z^(-TP), in which g_(p) designates a long-term prediction gain and TP designates a long-term prediction delay. The long-term prediction delay may typically take N=256 values lying between rmin and rmax samples. Fractional resolution is provided for the smallest values of delay so as to avoid differences which are too perceptible in terms of voicing frequency. A resolution of 1/6 is used, for example, between rmin=21 and 33+5/6, a resolution of 1/3 between 34 and 47+2/3, a resolution of 1/2 between 48 and 88+1/2, and integer resolution between 89 and rmax=142. Each possible delay is thus quantified by an integer index lying between 0 and N-1=255.
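The example resolutions quoted above do add up to N=256 quantified delays (78+42+82+54). The following sketch builds the corresponding table with exact fractions; the function names are hypothetical, and the mapping of index 0 to rmin with indices increasing with the delay is an assumption, since the text only states that each delay has an index between 0 and 255.

```python
from fractions import Fraction


def build_delay_table():
    """Return the quantified delays r_0 < r_1 < ... < r_(N-1) as Fractions."""
    # (first delay, last delay, denominator of the resolution step)
    segments = [
        (Fraction(21), Fraction(33) + Fraction(5, 6), 6),
        (Fraction(34), Fraction(47) + Fraction(2, 3), 3),
        (Fraction(48), Fraction(88) + Fraction(1, 2), 2),
        (Fraction(89), Fraction(142), 1),
    ]
    table = []
    for first, last, den in segments:
        r = first
        while r <= last:
            table.append(r)
            r += Fraction(1, den)
    return table


def nearest_index(table, delay):
    """Index of the quantified delay closest to an arbitrary delay."""
    return min(range(len(table)), key=lambda i: abs(table[i] - delay))
```

With this table, `len(build_delay_table())` is 256, the first entry is 21 and the last is 142.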

The long-term prediction delay is determined in two stages. In the first stage, the open-loop LTP analysis module 36 detects the voiced frames of the speech signal and, for each voiced frame, determines a degree of voicing MV and a search interval for the long-term prediction delay. The degree of voicing MV of a voiced frame may take three values: 1 for the slightly voiced frames, 2 for the moderately voiced frames and 3 for the very voiced frames. In the notation used below, a degree of voicing of MV=0 is taken for the unvoiced frames. The search interval is defined by a central value represented by its quantification index ZP and by a width in the field of quantification indices, dependent on the degree of voicing MV. For the slightly or moderately voiced frames (MV=1 or 2) the width of the search interval is of N1 indices, that is to say that the index of the long-term prediction delay will be sought between ZP-16 and ZP+15 if N1=32. For the very voiced frames (MV=3), the width of the search interval is of N3 indices, that is to say that the index of the long-term prediction delay will be sought between ZP-8 and ZP+7 if N3=16.
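The dependence of the search interval on the degree of voicing can be sketched as below, using the example widths N1=32 and N3=16. The function names are hypothetical, and clipping the interval to the valid index range 0..N-1 is an assumption made here so that the returned indices always designate existing delays.

```python
N = 256                          # number of quantified delays
WIDTH = {1: 32, 2: 32, 3: 16}    # N1 for MV=1 or 2, N3 for MV=3


def search_interval(zp, mv):
    """Indices ZP-width/2 .. ZP+width/2-1, clipped to 0..N-1."""
    if mv == 0:
        return range(0)          # unvoiced frame: no LTP delay is searched
    half = WIDTH[mv] // 2
    return range(max(0, zp - half), min(N, zp + half))


def differential_bits(mv):
    """Bits needed to code an offset inside the search interval."""
    return (WIDTH[mv] - 1).bit_length() if mv else 0
```

A narrower interval for MV=3 halves the number of delays to test in closed loop and saves one bit per differential index (4 bits instead of 5).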

Once the degree of voicing MV of a frame has been determined by the module 36, the module 30 carries out the quantification of the LSP parameters which were determined beforehand for this frame. This quantification is vectorial, for example, that is to say that it consists in selecting, from one or more predetermined quantification tables, a set of quantified parameters LSP_(Q) which is at a minimum distance from the set of LSP parameters supplied by the module 28. In a known way, the quantification tables differ depending on the degree of voicing MV supplied to the quantification module 30 by the open-loop analyser 36. A set of quantification tables for a degree of voicing MV is determined, during trials beforehand, so as to be statistically representative of frames having this degree MV. These sets are stored both in the coders and in the decoders implementing the invention. The module 30 delivers the set of quantified parameters LSP_(Q) as well as its index Q in the applicable quantification tables.

The speech coder 16 further comprises a module 42 for calculating the impulse response of the composite filter of the short-term synthesis filter and of the perceptual weighting filter. This composite filter has the transfer function W(z)/A(z). For calculating its impulse response h=(h(0), h(1), . . . , h(lst-1)) over the duration of one sub-frame, the module 42 takes, for the perceptual weighting filter W(z), that corresponding to the interpolated but unquantified LSP parameters, that is to say the one whose coefficients have been calculated by the module 32, and, for the synthesis filter 1/A(z), that corresponding to the quantified and interpolated LSP parameters, that is to say the one which will actually be reconstituted by the decoder.

In the second stage of the determination of the long-term prediction delay TP, the closed-loop LTP analysis module 38 determines the delay TP for each sub-frame of the voiced frames (MV=1, 2 or 3). This delay TP is characterised by a differential value DP in the domain of the quantification indices, coded over 5 bits if MV=1 or 2 (N1=32), and over 4 bits if MV=3 (N3=16). The index of the delay TP is equal to ZP+DP. In a known way, the closed-loop LTP analysis consists in determining, in the search interval for the long-term prediction delays T, the delay TP which, for each sub-frame of a voiced frame, maximises the normalised correlation:

$$\frac{\left[\sum_{i=0}^{lst-1} x(i)\,Y_T(i)\right]^2}{\sum_{i=0}^{lst-1}\left[Y_T(i)\right]^2}$$

where x(i) designates the weighted speech signal SW of the sub-frame from which has been subtracted the memory of the weighted synthesis filter (that is to say the response to a zero signal, due to its initial states, of the filter whose impulse response h was calculated by the module 42), and Y_(T)(i) designates the convolution product:

$$Y_T(i)=\sum_{j=0}^{i} u(j-T)\,h(i-j)$$

u(j-T) designating the predictable component of the excitation sequence delayed by T samples, estimated by the well-known technique of the adaptive codebook. For delays T shorter than the length of a sub-frame, the missing values of u(j-T) can be extrapolated from the previous values. The fractional delays are taken into account by oversampling the signal u(j-T) in the adaptive codebook. Oversampling by a factor m is obtained by means of interpolating multi-phase filters.
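A closed-loop search of this kind can be sketched as follows for integer delays only. This is an illustration under stated assumptions rather than the procedure of the module 38: the normalised correlation is taken in its usual form (squared cross-correlation of x and Y_T divided by the energy of Y_T), the missing samples for delays shorter than a sub-frame are filled by periodic repetition of the past excitation, and fractional delays and oversampling are not handled. All function names are hypothetical.

```python
def delayed_excitation(past_u, T, lst):
    """u(j - T) for j = 0..lst-1, with past_u[-1] holding u(-1).

    For T < lst, samples not yet available are filled by repeating the
    last T past samples periodically (an assumed extrapolation rule).
    """
    out = []
    for j in range(lst):
        k = j - T
        while k >= 0:              # would fall inside the current sub-frame
            k -= T
        out.append(past_u[len(past_u) + k])
    return out


def filtered(v, h):
    """Y(i) = sum over j = 0..i of v(j) h(i-j), limited to the sub-frame."""
    return [sum(v[j] * h[i - j] for j in range(i + 1)) for i in range(len(v))]


def closed_loop_search(x, past_u, h, delays):
    """Return (delay, gain) maximising (sum x*Y_T)^2 / sum Y_T^2."""
    best_t, best_score, best_gain = None, -1.0, 0.0
    for t in delays:
        y = filtered(delayed_excitation(past_u, t, len(x)), h)
        energy = sum(v * v for v in y)
        if energy <= 0.0:
            continue               # nothing to correlate with at this delay
        corr = sum(a * b for a, b in zip(x, y))
        score = corr * corr / energy
        if score > best_score:
            best_t, best_score, best_gain = t, score, corr / energy
    return best_t, best_gain
```

In the coder, `delays` would be the quantified delays whose indices lie in the search interval around ZP, so the cost of this loop is directly proportional to the interval width chosen in open loop.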

The long-term prediction gain g_(p) could be determined by the module 38 for each sub-frame, by applying the known formula:

$$g_p=\frac{\sum_{i=0}^{lst-1} x(i)\,Y_{TP}(i)}{\sum_{i=0}^{lst-1}\left[Y_{TP}(i)\right]^2}$$

However, in a preferred version of the invention, the gain g_(p) is calculated by the stochastic analysis module 40.

The stochastic excitation determined for each sub-frame by the module 40 is of the multi-pulse type. An innovation sequence of lst samples comprises np pulses with positions p(n) and amplitude g(n). Put another way, the pulses have an amplitude of 1 and are associated with respective gains g(n). Given that the LTP delay is not determined for the sub-frames of the unvoiced frames, a higher number of pulses can be taken for the stochastic excitation relating to these sub-frames, for example np=5 if MV=1, 2 or 3 and np=6 if MV=0. The positions and the gains calculated by the stochastic analysis module 40 are quantified by a module 44.

A bit ordering module 46 receives the various parameters which will be useful to the decoder, and compiles the binary sequence forwarded to the channel coder 22. These parameters are:

the index Q of the LSP parameters quantified for each frame;

the degree of voicing MV of each frame;

the index ZP of the centre of the LTP delays search interval for each voiced frame;

the differential index DP of the LTP delay for each sub-frame of a voiced frame, and the associated gain g_(p);

the positions p(n) and the gains g(n) of the pulses of the stochastic excitation for each sub-frame.

Some of these parameters may be of particular importance in the quality of reproduction of the speech, or be particularly sensitive to transmission errors. A module 48 is therefore provided, in the coder, which receives the various parameters and adds redundancy bits to some of them, making it possible to detect and/or correct any transmission errors. For example, as the degree of voicing MV, coded over two bits, is a critical parameter, it is desirable for it to arrive at the decoder with as few errors as possible. For that reason, redundancy bits are added to this parameter by the module 48. It is possible, for example, to add a parity bit to the two MV coding bits and to repeat the three bits thus obtained once. This example of redundancy makes it possible to detect all single or double errors and to correct all the single errors and 75% of the double errors.
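The redundancy example above (two MV bits, one parity bit, the three bits sent twice) gives four 6-bit codewords at a mutual Hamming distance of 4. The sketch below illustrates the encoding and a nearest-codeword decoder; it is a simplified illustration, not the check performed by the module 56. The bit order, the even-parity convention and the function names are assumptions, and this decoder only flags double errors: it does not attempt the partial correction of double errors mentioned in the text.

```python
def encode_mv(mv):
    """Map MV in 0..3 to six bits: (b1, b0, parity), sent twice."""
    b1, b0 = (mv >> 1) & 1, mv & 1
    triple = [b1, b0, b1 ^ b0]      # even parity (assumed convention)
    return triple + triple


CODEBOOK = {mv: encode_mv(mv) for mv in range(4)}


def hamming(a, b):
    """Number of positions in which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_mv(bits):
    """Return (mv, status); status is 'ok', 'corrected' or 'detected'."""
    dist = {mv: hamming(bits, cw) for mv, cw in CODEBOOK.items()}
    best = min(dist, key=dist.get)
    if dist[best] == 0:
        return best, "ok"
    if dist[best] == 1:
        return best, "corrected"    # single error: unique nearest codeword
    return None, "detected"         # two or more errors: flag only
```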

The allocation of the binary data rate per 20 ms frame is, for example, that indicated in table I.

In the example considered here, the channel coder 22 is the one used in the pan-European system for radio communication with mobiles (GSM). This channel coder, described in detail in GSM Recommendation 05.03, was developed for a 13 kbit/s speech coder of RPE-LTP type which also produces 260 bits per 20 ms frame. The sensitivity of each of the 260 bits has been determined on the basis of listening tests. The bits output by the source coder have been grouped together into three categories. The first of these categories (IA) groups together 50 bits which are coded by convolution on the basis of a generator polynomial giving a redundancy of one half with a constraint length equal to 5. Three parity bits are calculated and added to the 50 bits of category IA before the convolutional coding. The second category (IB) numbers 132 bits which are protected to a level of one half by the same polynomial as the previous category. The third category (II) contains 78 unprotected bits. After application of the convolutional code, the bits (456 per frame) are subjected to interleaving. The ordering module 46 of the new source coder implementing the invention distributes the bits into the three categories on the basis of the subjective importance of these bits.

TABLE I (bits per 20 ms frame)
______________________________________________________
quantified parameters   MV = 0   MV = 1 or 2   MV = 3
______________________________________________________
LSP                       34         34          34
MV + redundancy            6          6           6
ZP                        --          8           8
DP                        --         20          16
g_(TP)                    --         20          24
pulse positions           80         72          72
pulse gains              140        100         100
Total                    260        260         260
______________________________________________________
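The totals of Table I and the resulting source bit rate can be checked with a few lines. The sketch below only restates the figures of the table; the dictionary layout and function names are chosen for illustration.

```python
FRAME_MS = 20

# Bits per 20 ms frame, copied from Table I; absent parameters are omitted.
TABLE_I = {
    "MV=0": {"LSP": 34, "MV+redundancy": 6,
             "pulse positions": 80, "pulse gains": 140},
    "MV=1 or 2": {"LSP": 34, "MV+redundancy": 6, "ZP": 8, "DP": 20,
                  "g_TP": 20, "pulse positions": 72, "pulse gains": 100},
    "MV=3": {"LSP": 34, "MV+redundancy": 6, "ZP": 8, "DP": 16,
             "g_TP": 24, "pulse positions": 72, "pulse gains": 100},
}


def bits_per_frame(mode):
    """Total number of bits produced per frame for a voicing mode."""
    return sum(TABLE_I[mode].values())


def bit_rate(mode):
    """Source bit rate in bit/s for a voicing mode."""
    return bits_per_frame(mode) * 1000 // FRAME_MS
```

Every mode totals 260 bits, that is 13 kbit/s. With nst=4 sub-frames, the DP row corresponds to 5 bits per sub-frame for MV=1 or 2 and 4 bits for MV=3, and the 4 bits saved per frame in the latter case reappear in the g_(TP) row (24 instead of 20).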

A mobile radio communications station able to receive the speech signal processed by the source coder 16 is represented diagrammatically in FIG. 2. The radio signal received is first of all processed by a demodulator 50 then by a channel decoder 52 which perform the dual operations of those of the modulator 24 and of the channel coder 22. The channel decoder 52 supplies the speech decoder 54 with a binary sequence which, in the absence of transmission errors or when any errors have been corrected by the channel decoder 52, corresponds to the binary sequence which the ordering module 46 delivered at the coder 16. The decoder 54 comprises a module 56 which receives this binary sequence and which identifies the parameters relating to the various frames and sub-frames. The module 56 also performs a few checks on the parameters received. In particular, the module 56 examines the redundancy bits inserted by the module 48 of the coder, in order to detect and/or correct the errors affecting the parameters associated with these redundancy bits.

For each speech frame to be synthesised, a module 58 of the decoder receives the degree of voicing MV and the Q index of quantification of the LSP parameters. The module 58 recovers the quantified LSP parameters from the tables corresponding to the value of MV and, after interpolation, converts them into coefficients a_(i) for the short-term synthesis filter 60. For each speech sub-frame to be synthesised, a pulse generator 62 receives the positions p(n) of the np pulses of the stochastic excitation. The generator 62 delivers pulses of unit amplitude which are each multiplied at 64 by the associated gain g(n). The output of the amplifier 64 is applied to the long-term synthesis filter 66. This filter 66 has an adaptive codebook structure. The output samples u of the filter 66 are stored in memory in the adaptive codebook 68 so as to be available for the subsequent sub-frames. The delay TP relating to a sub-frame, calculated from the quantification indices ZP and DP, is supplied to the adaptive codebook 68 to produce the signal u delayed as appropriate. The amplifier 70 multiplies the signal thus delayed by the long-term prediction gain g_(p). The long-term filter 66 finally comprises an adder 72 which adds the outputs of the amplifiers 64 and 70 to supply the excitation sequence u. When the LTP analysis has not been performed at the coder, for example if MV=0, a zero prediction gain g_(p) is imposed on the amplifier 70 for the corresponding sub-frames. The excitation sequence is applied to the short-term synthesis filter 60, and the resulting signal can further, in a known way, be submitted to a post-filter 74, the coefficients of which depend on the received synthesis parameters, in order to form the synthetic speech signal S'. The output signal S' of the decoder 54 is then converted to analogue by the converter 76 before being amplified in order to drive a loudspeaker 78.
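The excitation path of the decoder (pulse generator 62, amplifier 64, adaptive codebook 68, amplifier 70 and adder 72) can be sketched as follows for one sub-frame. This is an illustration restricted to integer delays, written as the recursion u(n)=v(n)+g_(p).u(n-TP) implied by the filter 1/B(z); fractional delays, quantified gains and the short-term filter 60 are left out, and the function and parameter names are hypothetical.

```python
def synthesise_excitation(history, positions, gains, tp, gp, lst=40):
    """Return the lst excitation samples u of one sub-frame.

    history : past excitation samples (adaptive codebook), oldest first,
              holding at least tp samples when gp is non-zero
    tp      : integer LTP delay in samples
    gp      : long-term prediction gain (0 when no LTP analysis was done)
    """
    innovation = [0.0] * lst
    for p, g in zip(positions, gains):
        innovation[p] += g         # unit pulse at p(n) scaled by g(n)
    buf = list(history)
    start = len(buf)
    for n in range(lst):
        past = buf[start + n - tp] if gp != 0.0 else 0.0
        buf.append(innovation[n] + gp * past)   # adder 72
    return buf[start:]
```

The returned samples would then be appended to `history` so as to be available for the following sub-frames, as the text describes for the adaptive codebook 68.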

The open-loop LTP analysis process implemented by the module 36 of the coder, according to a first aspect of the invention, will now be described with reference to FIGS. 3 to 6.

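In a first stage 90, the module 36, for each sub-frame st=0, 1, . . . , nst-1 of the current frame, calculates and stores the autocorrelations C_(st)(k) and the delayed energies G_(st)(k) of the weighted speech signal SW for the integer delays k lying between rmin and rmax:

$$C_{st}(k)=\sum_{i=st\cdot lst}^{(st+1)\cdot lst-1} SW(i)\,SW(i-k), \qquad G_{st}(k)=\sum_{i=st\cdot lst}^{(st+1)\cdot lst-1}\left[SW(i-k)\right]^2$$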

The energies per sub-frame R0_(st) are also calculated:

$$R0_{st}=\sum_{i=st\cdot lst}^{(st+1)\cdot lst-1}\left[SW(i)\right]^2$$

At stage 90, the module 36 furthermore, for each sub-frame st, determines the integer delay K_(st) which maximises the open-loop estimate P_(st)(k) of the long-term prediction gain over the sub-frame st, excluding those delays k for which the autocorrelation C_(st)(k) is negative or smaller than a small fraction ε of the energy R0_(st) of the sub-frame. The estimate P_(st)(k), in decibels, is given by:

$$P_{st}(k)=20\,\log_{10}\left[\frac{R0_{st}}{R0_{st}-C_{st}^{2}(k)/G_{st}(k)}\right]$$

Maximising P_(st)(k) thus amounts to maximising the expression X_(st)(k)=C_(st)²(k)/G_(st)(k) as indicated in FIG. 6. The integer delay K_(st) is the basic delay in integer resolution for the sub-frame st. Stage 90 is followed by a comparison 92 between a first open-loop estimate of the global prediction gain over the current frame and a predetermined threshold S0 typically lying between 1 and 2 decibels (for example, S0=1.5 dB). The first estimate of the global prediction gain is equal to:

$$20\,\log_{10}\left[\frac{R0}{R0-\sum_{st=0}^{nst-1} X_{st}(K_{st})}\right]$$

where R0 is the total energy of the frame (R0=R0_(0)+R0_(1)+ . . . +R0_(nst-1)), and X_(st)(K_(st))=C_(st)²(K_(st))/G_(st)(K_(st)) designates the maximum determined at stage 90 relative to the sub-frame st. As FIG. 6 indicates, the comparison 92 can be performed without having to calculate the logarithm.
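Stage 90 and the comparison 92 can be sketched as follows. The sketch assumes the usual sub-frame sums for the quantities involved (C_(st)(k) as the sum of SW(i).SW(i-k), G_(st)(k) as the sum of SW(i-k)², R0_(st) as the sum of SW(i)², all over the samples of the sub-frame) and takes the global gain estimate as 20.log10 of R0 over R0 minus the sum of the X_(st)(K_(st)), in line with the per-sub-frame formula for P_(st)(k). The function names, the default ε and the handling of a silent frame are assumptions made for illustration.

```python
def stage_90(sw, start, lst, rmin, rmax, eps=0.0):
    """For one sub-frame, return (R0_st, K_st, X_st(K_st)).

    sw    : weighted speech samples, with at least rmax samples before start
    start : index in sw of the first sample of the sub-frame
    eps   : fraction of R0_st below which a correlation C_st(k) is rejected
    """
    seg = range(start, start + lst)
    r0 = sum(sw[i] ** 2 for i in seg)
    best_k, best_x = None, 0.0
    for k in range(rmin, rmax + 1):
        c = sum(sw[i] * sw[i - k] for i in seg)
        g = sum(sw[i - k] ** 2 for i in seg)
        if c < 0.0 or c < eps * r0 or g <= 0.0:
            continue               # delay excluded from the maximisation
        x = c * c / g              # maximising P_st(k) == maximising X_st(k)
        if x > best_x:
            best_k, best_x = k, x
    return r0, best_k, best_x


def is_voiced(r0_list, x_list, s0_db=1.5):
    """Comparison 92 without a logarithm.

    20*log10(R0 / (R0 - X)) >= S0  <=>  R0 >= 10**(S0/20) * (R0 - X),
    since R0 - X is never negative.
    """
    r0, x = sum(r0_list), sum(x_list)
    return r0 > 0.0 and r0 >= 10.0 ** (s0_db / 20.0) * (r0 - x)
```

The constant 10**(S0/20) can be precomputed once, which is what removes the logarithm from the per-frame decision.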

If the comparison 92 shows a first estimate of the prediction gain below the threshold S0, it is considered that the speech signal contains too few long-term correlations to be voiced, and the degree of voicing MV of the current frame is taken as equal to 0 at stage 94, which, in this case, terminates the operations performed by the module 36 on this frame. If, in contrast, the threshold S0 is crossed at stage 92, the current frame is detected as voiced and the degree MV will be equal to 1, 2 or 3. The module 36 then, for each sub-frame st, calculates a list I_(st) containing candidate delays to constitute the centre ZP of the search interval for the long-term prediction delays.

The operations performed by the module 36 for each sub-frame st (st initialised to 0 at stage 96) of a voiced frame commence with the determination 98 of a selection threshold SE_(st) in decibels equal to a defined fraction β of the estimate P_(st)(K_(st)) of the prediction gain in decibels over the sub-frame, maximised at stage 90 (β=0.75 typically). For each sub-frame st of a voiced frame, the module 36 determines the basic delay rbf, in integer or fractional resolution, for the remainder of the processing. This basic delay could be taken as equal to the integer K_(st) obtained at stage 90. The fact of searching for the basic delay in fractional resolution around K_(st) makes it possible, however, to gain in terms of precision. Stage 100 thus consists in searching, around the integer delay K_(st) obtained at stage 90, for the fractional delay which maximises the expression C_(st)²/G_(st). This search can be performed at the maximum resolution of the fractional delays (1/6 in the example described here) even if the integer delay K_(st) is not in the domain in which this maximum resolution applies. For example, the number Δ_(st) which maximises C_(st)²(K_(st)+δ/6)/G_(st)(K_(st)+δ/6) is determined for -6<δ<+6, then the basic delay rbf in maximum resolution is taken as equal to K_(st)+Δ_(st)/6. For the fractional values T of the delay, the autocorrelations C_(st)(T) and the delayed energies G_(st)(T) are obtained by interpolation from values stored in memory at stage 90 for the integer delays. Clearly, the basic delay relating to a sub-frame could also be determined in fractional resolution as from stage 90 and taken into account in the first estimate of the global prediction gain over the frame.

Once the basic delay rbf has been determined for a sub-frame, an examination 101 is carried out of the sub-multiples of this delay so as to adopt those for which the prediction gain is relatively high (FIG. 4), then of the multiples of the smallest sub-multiple adopted (FIG. 5). At stage 102, the address j in the list I_(st) and the index m of the sub-multiple are initialised at 0 and 1 respectively. A comparison 104 is performed between the sub-multiple rbf/m and the minimum delay rmin. The sub-multiple rbf/m is examined only if it is not smaller than rmin. In that case, the value of the index of the quantified delay r_(i) which is closest to rbf/m is taken for the integer i (stage 106), then, at 108, the estimated value of the prediction gain P_(st)(r_(i)) associated with the quantified delay r_(i) for the sub-frame in question is compared with the selection threshold SE_(st) calculated at stage 98:

$$P_{st}(r_i)=20\,\log_{10}\left[\frac{R0_{st}}{R0_{st}-C_{st}^{2}(r_i)/G_{st}(r_i)}\right]$$

with, in the case of the fractional delays, an interpolation of the values C_(st) and G_(st) calculated at stage 90 for the integer delays. If P_(st)(r_(i))<SE_(st), the delay r_(i) is not taken into consideration, and stage 110 for incrementing the index m is entered directly before again performing the comparison 104 for the following sub-multiple. If the test 108 shows that P_(st)(r_(i))≥SE_(st), the delay r_(i) is adopted and stage 112 is executed before the index m is incremented at stage 110. At stage 112, the index i is stored in memory at address j in the list I_(st), the value m is given to the integer m0 intended to be equal to the index of the smallest sub-multiple adopted, then the address j is incremented by one unit.

The examination of the sub-multiples of the basic delay is terminated when the comparison 104 shows rbf/m<rmin. Then those delays are examined which are multiples of the smallest rbf/m0 of the sub-multiples previously adopted, following the process illustrated in FIG. 5. This examination commences with initialisation 114 of the index n of the multiple: n=2. A comparison 116 is performed between the multiple n.rbf/m0 and the maximum delay rmax. If n.rbf/m0≤rmax, the test 118 is performed in order to determine whether the index m0 of the smallest sub-multiple is an integer multiple of n. If so, the delay n.rbf/m0 has already been examined during the examination of the sub-multiples of rbf, and stage 120 is entered directly, for incrementing the index n before again performing the comparison 116 for the following multiple. If the test 118 shows that m0 is not an integer multiple of n, the multiple n.rbf/m0 has to be examined. The value of the index of the quantified delay r_(i) which is closest to n.rbf/m0 is then taken for the integer i (stage 122), then, at 124, the estimated value of the prediction gain P_(st)(r_(i)) is compared with the selection threshold SE_(st). If P_(st)(r_(i))<SE_(st), the delay r_(i) is not taken into consideration, and stage 120 for incrementing the index n is entered directly. If the test 124 shows that P_(st)(r_(i))≥SE_(st), the delay r_(i) is adopted, and stage 126 is executed before incrementing the index n at stage 120. At stage 126, the index i is stored in memory at address j in the list I_(st), then the address j is incremented by one unit.

The examination of the multiples of the smallest sub-multiple is terminated when the comparison 116 shows that n.rbf/m0>rmax. At that point, the list I_(st) contains j indices of candidate delays. If it is desired, for the following stages, to limit the maximum length of the list I_(st) to jmax, the length j_(st) of this list can be taken as equal to min(j, jmax) (stage 128) then, at stage 130, the list I_(st) can be sorted in the order of decreasing gains C_(st) ² (r_(Ist)(j))/G_(st) (r_(Ist)(j)) for 0≦j<j_(st) so as to preserve only the j_(st) delays yielding the highest values of gain. The value of jmax is chosen on the basis of the compromise envisaged between the effectiveness of the search for the LTP delays and the complexity of this search. Typical values of jmax range from 3 to 5.
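The limitation of the candidate list at stages 128 and 130 can be illustrated by the following minimal sketch. The function name `trim_candidates` and the callable `gain_of`, which is assumed to return the open-loop gain estimate for a delay index, are hypothetical and do not come from the description.

```python
def trim_candidates(indices, gain_of, jmax=4):
    """Keep at most jmax candidate delay indices, highest gain first.

    indices : list I_st of candidate delay indices (stage 126 output)
    gain_of : hypothetical callable giving the gain estimate of an index
    jmax    : maximum list length (the text quotes typical values 3 to 5)
    """
    # Stage 130: sort in the order of decreasing gains.
    ranked = sorted(indices, key=gain_of, reverse=True)
    # Stage 128: j_st = min(j, jmax) entries are preserved.
    return ranked[:min(len(ranked), jmax)]
```

For example, with gains {10: 0.2, 20: 0.9, 40: 0.5, 60: 0.7} and jmax=3, the indices retained are 20, 60 and 40, in that order.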

Once the sub-multiples and the multiples have been examined and the list I_(st) has thus been obtained (FIG. 3), the analysis module 36 calculates a quantity Ymax determining a second open-loop estimate of the long-term prediction gain over the whole of the frame, as well as indices ZP, ZP0 and ZP1 in a phase 132, the progress of which is detailed in FIG. 6. This phase 132 consists in testing search intervals of length N1 to determine the one which maximises a second estimate of the global prediction gain over the frame. The intervals tested are those whose centres are the candidate delays contained in the list I_(st) calculated during phase 101. Phase 132 commences with a stage 136 in which the address j in the list I_(st) is initialised to 0. At stage 138, the index I_(st) (j) is checked to see whether it has already been encountered by testing a preceding interval centred on I_(st') (j') with st'<st and 0≦j'<j_(st'), so as to avoid testing the same interval twice. If the test 138 reveals that I_(st) (j) already featured in a list I_(st') with st'<st, the address j is incremented directly at stage 140, then it is compared with the length j_(st) of the list I_(st) (comparison 142). If the comparison 142 shows that j<j_(st), stage 138 is re-entered for the new value of the address j. When the comparison 142 shows that j=j_(st), all the intervals relating to the list I_(st) have been tested, and phase 132 is terminated. When test 138 is negative, the interval centred on I_(st) (j) is tested, starting with stage 148 at which, for each sub-frame st', the index i_(st') is determined of the optimal delay which, over this interval, maximises the open-loop estimate P_(st') (r_(i)) of the long-term prediction gain, that is to say which maximises the quantity Y_(st') (i)=C_(st') ² (r_(i))/G_(st') (r_(i)) in which r_(i) designates the quantified delay of index i for I_(st) (j)-N1/2≦i<I_(st) (j)+N1/2 and 0≦i<N.
During the maximisation 148 relating to a sub-frame st', those indices i for which the autocorrelation C_(st') (r_(i)) is negative are set aside, a priori, in order to avoid degrading the coding. If it is found that all the values of i lying in the interval tested [I_(st) (j)-N1/2, I_(st) (j)+N1/2[ give rise to negative autocorrelations C_(st') (r_(i)), the index i_(st') for which this autocorrelation is smallest in absolute value is selected. Next, at 150, the quantity Y determining the second estimate of the global prediction gain for the interval centred on I_(st) (j) is calculated according to: ##EQU9## then compared with Ymax, where Ymax represents the value to be maximised. This value Ymax is, for example, initialised to 0 at the same time as the index st at stage 96. If Y≦Ymax, stage 140 for incrementing the index j is entered directly. If the comparison 150 shows that Y>Ymax, stage 152 is executed before incrementing the address j at stage 140. At this stage 152, the index ZP is taken as equal to I_(st) (j) and the indices ZP0 and ZP1 are taken as equal respectively to the smallest and to the largest of the indices i_(st') determined at stage 148.

At the end of phase 132 relating to a sub-frame st, the index st is incremented by one unit (stage 154) then, at stage 156, compared with the number nst of sub-frames per frame. If st<nst, stage 98 is re-entered to perform the operations relating to the following sub-frame. When the comparison 156 shows that st=nst, the index ZP designates the centre of the search interval which will be supplied to the closed-loop LTP analysis module 38, and ZP0 and ZP1 are indices, the difference between which is representative of the dispersion on the optimal delays per sub-frame in the interval centred on ZP.

At stage 158, the module 36 determines the degree of voicing MV on the basis of the second open-loop estimate of the gain expressed in decibels: Gp=20.log₁₀ [R0/(R0-Ymax)]. Two other thresholds S1 and S2 are made use of. If Gp≦S1, the degree of voicing MV is taken as equal to 1 for the current frame. The threshold S1 typically lies between 3 and 5 dB; for example, S1=4 dB. If S1<Gp≦S2, the degree of voicing MV is taken as equal to 2 for the current frame. The threshold S2 typically lies between 5 and 8 dB; for example, S2=7 dB. If Gp>S2, the dispersion in the optimal delays for the various sub-frames of the current frame is examined. If ZP1-ZP<N3/2 and ZP-ZP0≦N3/2, an interval of length N3 centred on ZP suffices to take account of all the optimum delays and the degree of voicing is taken as equal to 3. Otherwise, if ZP1-ZP≧N3/2 or ZP-ZP0>N3/2, the degree of voicing is taken as equal to 2.
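The decision of stage 158 can be summarised by the following minimal sketch, which applies to a frame already detected as voiced. The function name `voicing_degree` and its argument names are hypothetical; the default thresholds are the example values S1=4 dB, S2=7 dB and N3=16 quoted in the text, and 0≦Ymax<R0 is assumed so that the logarithm is defined.

```python
import math

def voicing_degree(r0, ymax, zp, zp0, zp1, s1=4.0, s2=7.0, n3=16):
    """Degree of voicing MV (1, 2 or 3) of a voiced frame.

    r0, ymax     : frame energy R0 and maximised quantity Ymax
    zp, zp0, zp1 : centre index and extreme optimal-delay indices
    """
    # Second open-loop estimate of the LTP gain, in decibels.
    gp = 20.0 * math.log10(r0 / (r0 - ymax))
    if gp <= s1:
        return 1          # slightly voiced
    if gp <= s2:
        return 2          # moderately voiced
    # Gp > S2: examine the dispersion of the optimal delays.
    if zp1 - zp < n3 / 2 and zp - zp0 <= n3 / 2:
        return 3          # interval of length N3 centred on ZP suffices
    return 2
```

With R0=1 and Ymax=0.75 the gain is about 12 dB, so the result depends only on the dispersion test: ZP=100, ZP0=96, ZP1=104 gives MV=3, whereas ZP1=110 gives MV=2.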

The index ZP of the centre of the prediction delay search interval for avoiced frame may lie between 0 and N-1=255, and the differential indexDP determined for the module 38 may range from -16 to +15 if MV=1 or 2,and from -8 to +7 if MV=3 (case of N1=32, N3=16). The index ZP+DP of thedelay TP finally determined may therefore, in certain cases, be lessthan 0 or greater than 255. This allows the closed-loop LTP analysis torange equally over a few delays TP smaller than rmin or larger thanrmax. Thus the subjective quality of the reproduction of the so-calledpathological voices and of non-vocal signals (DTMF voice frequencies orsignalling frequencies used by the switched telephone network) isenhanced. Another possibility is to take, for the search interval, thefirst or last 32 quantification indices of the delays if ZP<16 or ZP>240with MV=1 or 2, and the first or last 16 indices if ZP<8 or ZP>248 withMV=3.

The fact of reducing the delay search interval for very voiced frames(typically 16 values for MV=3 instead of 32 for MV=1 or 2) makes itpossible to reduce the complexity of the closed-loop LTP analysisperformed by the module 38 by reducing the number of convolutions y_(T)(i) to be calculated according to formula (1). Another advantage is thatone coding bit of the differential index DP is saved. As the output datarate is constant, this bit can be reallocated to coding of otherparameters. In particular, this supplementary bit can be allocated toquantifying the long-term prediction gain g_(p) calculated by the module40. In fact, a higher precision on the gain g_(p) by virtue of anadditional quantifying bit is appreciable since this parameter isperceptually important for very voiced sub-frames (MV=3). Anotherpossibility is to provide a parity bit for the delay TP and/or the gaing_(p), making it possible to detect any errors affecting theseparameters.

A few modifications can be made to the open-loop LTP analysis process described above by reference to FIGS. 3 to 6.

According to a first variant of this process, the first optimisationsperformed at stage 90 relating to the various sub-frames are replaced bya single optimisation covering the whole of the frame. In addition tothe parameters C_(st) (k) and G_(st) (k) calculated for each sub-framest, the autocorrelations C(k) and the delayed energies G(k) are alsocalculated for the whole of the frame: ##EQU10##

Then the basic delay is determined in integer resolution K which maximises X(k)=C² (k)/G(k) for rmin≦k≦rmax. The first estimate of the gain compared with S0 at stage 92 is then P(K)=20.log₁₀ [R0/[R0-X(K)]]. Next a single basic delay is determined around K in fractional resolution rbf, and the examination 101 of the sub-multiples and of the multiples is performed once and produces a single list I instead of nst lists I_(st). Phase 132 is then performed a single time for this list I, distinguishing the sub-frames only at stages 148, 150 and 152. This variant embodiment has the advantage of reducing the complexity of the open-loop analysis.

According to a second variant of the open-loop LTP analysis process, thedomain [rmin, rmax] of possible delays is subdivided into nzsub-intervals having, for example, the same length (nz=3 typically), andthe first optimisations performed at stage 90 relating to the varioussub-frames are replaced by nz optimisations in the various sub-intervalseach covering the whole of the frame. Thus nz basic delays K₁ ', . . . ,K_(nz) ' are obtained in integer resolution. The voiced/unvoiceddecision (stage 92) is taken on the basis of that one of the basicdelays K_(i) ' which yields the largest value for the first open-loopestimate of the long-term prediction gain. Next, if the frame is voiced,the basic delays are determined in fractional resolution by the sameprocess as at stage 100, but allowing only the quantified values ofdelay. The examination 101 of the sub-multiples and of the multiples isnot performed. For the phase 132 of calculation of the second estimateof the prediction gain, the nz basic delays previously determined aretaken as candidate delays. This second variant makes it possible todispense with the systematic examination of the sub-multiples and of themultiples which are, in general, taken into consideration by virtue ofthe subdivision of the domain of the possible delays.

According to a third variant of the open-loop LTP analysis process, the phase 132 is modified in that, at the optimisation stages 148, on the one hand, that index i_(st') is determined which maximises C_(st') ² (r_(i))/G_(st') (r_(i)) for I_(st) (j)-N1/2≦i<I_(st) (j)+N1/2 and 0≦i<N, and, on the other hand, in the course of the same maximisation loop, that index k_(st') which maximises this same quantity over a reduced interval I_(st) (j)-N3/2≦i<I_(st) (j)+N3/2 and 0≦i<N. Stage 152 is also modified: the indices ZP0 and ZP1 are no longer stored in memory, but a quantity Ymax' is stored instead, defined in the same way as Ymax but by reference to the reduced-length interval: ##EQU11##

In this third variant, the determination 158 of the voicing mode leadsmore often to the degree of voicing MV=3 being selected. Account is alsotaken, in addition to the previously described gain Gp, of a thirdopen-loop estimate of the LTP gain, corresponding to Ymax': Gp'=20.log₁₀[R0/(R0-Ymax')]. The degree of voicing is MV=1 if Gp≦S1, MV=3 if Gp'>S2and MV=2 if neither of these two conditions is satisfied. By thusincreasing the proportion of frames of degree MV=3, the averagecomplexity of the closed-loop analysis is reduced and robustness totransmission errors is enhanced.

A fourth variant of the open-loop LTP analysis process particularlyconcerns the slightly voiced frames (MV=1). These frames oftencorrespond to a start or to an end of a region of voicing. Frequently,these frames may include from one to three sub-frames for which the gaincoefficient of the long-term synthesis filter is zero or even negative.It is proposed not to perform the closed-loop LTP analysis for thesub-frames in question, so as to reduce the average complexity of thecoding. This can be carried out by storing in memory, at stage 152 ofFIG. 6, nst pointers indicating, for each sub-frame st', whether theautocorrelation C_(st') corresponding to the delay of index i_(st'), isnegative or even very small. Once all the intervals have been referencedin the lists I_(st'), the sub-frames for which the prediction gain isnegative or negligible can be identified by looking up the nst pointers.If appropriate, the module 38 is disabled for the correspondingsub-frames. This does not affect the quality of the LTP analysis, sincethe prediction gain corresponding to these sub-frames will in any eventbe practically zero.

Another aspect of the invention relates to the module 42 for calculatingthe impulse response of the weighted synthesis filter. The closed-loopLTP analysis module 38 needs this impulse response h over the durationof a sub-frame in order to calculate the convolutions y_(T) (i)according to formula (1). The stochastic analysis module 40 also needsit in order to calculate convolutions as will be seen later. The fact ofhaving to calculate convolutions with a response h extending over theduration of a sub-frame (lst=40 typically) implies relative complexityof coding, which it would be desirable to reduce, particularly in orderto increase the endurance of the mobile station. In certain cases, ithas been proposed to truncate the impulse response to a length less thanthe length of a sub-frame (for example, to 20 samples), but this maydegrade the quality of the coding. It is proposed, according to theinvention, to truncate the impulse response h by taking account, on theone hand, of the energy distribution of this response and, on the otherhand, of the degree of voicing MV of the frame in question, determinedby the open-loop LTP analysis module 36.

The operations performed by the module 42 are, for example, inaccordance with the flow chart of FIG. 7. The impulse response is firstof all calculated at stage 160 over a length pst greater than the lengthof a sub-frame and sufficiently long to be sure of taking account of allthe energy of the impulse response (for example, pst=60 for nst=4 andlst=40 if the short-term linear prediction is of order q=10). Thetruncated energies of the impulse response are also calculated at stage160: ##EQU12##

The components h(i) of the impulse response and the truncated energiesEh(i) may be obtained by filtering a unit pulse by means of a filterwith transfer function W(z)/A(z), with zero initial states, or even byrecursion, ##EQU13## for 0<i<pst, with f(i)=h(i)=0 for i<0,δ(0)=f(0)=h(0)=Eh(0)=1 and δ(i)=0 for i≠0. In expression (2), thecoefficients a_(k) are those involved in the perceptual weightingfilter, that is to say the interpolated but unquantified linearprediction coefficients, while, in expression (3), the coefficientsa_(k) are those applied to the synthesis filter, that is to say thequantified and interpolated linear prediction coefficients.

Next, the module 42 determines the smallest length Lα such that theenergy Eh(Lα-1) of the impulse response, truncated to Lα samples, is atleast equal to a proportion α of its total energy Eh(pst-1), estimatedover pst samples. A typical value of α is 98%. The number Lα isinitialised to pst at stage 162 and decremented by one unit at 166 aslong as Eh(Lα-2)>α.Eh(pst-1) (test 164). The length Lα sought isobtained when test 164 shows that Eh(Lα-2)≦α.Eh(pst-1).

In order to take account of the degree of voicing MV, a corrector term Δ(MV) is added to the value of Lα which has been obtained (stage 168). This corrector term is preferably an increasing function of the degree of voicing. For example, values may be taken such as Δ(0)=-5, Δ(1)=0, Δ(2)=+5 and Δ(3)=+7. In this way, the impulse response h will be determined in a way which is all the more precise the greater the degree of voicing of the speech. The truncation length Lh of the impulse response is taken as equal to Lα if Lα≦lst and to lst otherwise. The remaining samples of the impulse response (h(i) with i≧Lh) are treated as zero and can be deleted.
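The determination of the truncation length (stages 162 to 168) can be sketched as follows. This is an illustration under stated assumptions rather than the coder's actual routine: the function name `truncation_length` is hypothetical, the truncated energy Eh(i) is assumed to be the cumulative sum of h(k)² for 0≦k≦i (consistent with h(0)=Eh(0)=1), the upper bound is taken as the sub-frame length lst, and the lower bound of one sample is an added safeguard not stated in the text.

```python
def truncation_length(h, mv, alpha=0.98, lst=40, delta=(-5, 0, 5, 7)):
    """Truncation length Lh of the impulse response h (pst samples).

    mv    : degree of voicing (0 to 3)
    alpha : proportion of the total energy to be retained
    delta : corrector terms for MV = 0, 1, 2, 3 (example values)
    """
    # Assumed definition: Eh(i) = h(0)^2 + ... + h(i)^2.
    eh, acc = [], 0.0
    for sample in h:
        acc += sample * sample
        eh.append(acc)
    pst = len(h)
    total = eh[pst - 1]
    # Stage 162, test 164, stage 166: shrink while the shorter
    # response still holds more than alpha of the total energy.
    l_alpha = pst
    while l_alpha >= 2 and eh[l_alpha - 2] > alpha * total:
        l_alpha -= 1
    # Stage 168: corrector term depending on the degree of voicing.
    l_alpha += delta[mv]
    # Cap at the sub-frame length; floor of 1 is an assumption.
    return max(1, min(l_alpha, lst))
```

For a geometrically decaying response h(i)=0.5^i over 60 samples, three samples hold 98% of the energy, so the lengths obtained are 3 for MV=1, 8 for MV=2 and 10 for MV=3.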

With the truncation of the impulse response, the calculation (1) of theconvolutions y_(T) (i) by the closed-loop LTP analysis module 38 ismodified in the following way: ##EQU14##

Obtaining these convolutions, which represents a significant part of thecalculations performed, therefore requires substantially fewermultiplications, additions and addressing in the adaptive codebook whenthe impulse response is truncated. Dynamic truncation of the impulseresponse, invoking the degree of voicing MV, makes it possible to obtainsuch a reduction in complexity without affecting the quality of thecoding. The same considerations apply for the calculations ofconvolutions performed by the stochastic analysis module 40. Theseadvantages are particularly appreciable when the perceptual weightingfilter has a transfer function of the form W(z)=A(z/γ₁)/A(z/γ₂) with0<γ₂ <γ₁ <1 which gives rise to impulse responses which are generallylonger than those of the form W(z)=A(z)/A(z/γ) which are more usuallyemployed in analysis-by-synthesis coders.

A third aspect of the invention relates to the stochastic analysis module 40 serving for modelling the unpredictable part of the excitation.

The stochastic excitation considered here is of the multi-pulse type. The stochastic excitation relating to a sub-frame is represented by np pulses with positions p(n) and amplitudes, or gains, g(n) (1≦n≦np). The long-term prediction gain g_(p) can also be calculated in the course of the same process. In general, it can be considered that the excitation sequence relating to a sub-frame includes nc contributions associated respectively with nc gains. The contributions are vectors of lst samples which, weighted by the associated gains and summed, correspond to the excitation sequence of the short-term synthesis filter. One of the contributions may be predictable, or several in the case of a long-term synthesis filter with several taps ("Multi-tap pitch synthesis filter"). The other contributions, in the present case, are np vectors including only 0's except for one pulse of amplitude 1. That being so, nc=np if MV=0, and nc=np+1 if MV=1, 2 or 3.

The multi-pulse analysis including the calculation of the gain g_(p)=g(0) consists, in a known way, in finding, for each sub-frame,positions p(n) (1≦n≦np) and gains g(n) (0≦n≦np) which minimise theperceptually weighted quadratic error E between the speech signal andthe synthesised signal, given by: ##EQU15## the gains being a solutionof the linear system g.B=b.

In the above notations:

X designates an initial target vector composed of the lst samples of theweighted speech signal SW without memory: X=(x(0), x(1), . . . ,x(lst-1)), the x(i)'s having been calculated as indicated previouslyduring the closed-loop LTP analysis;

g designates the row vector composed of the np+1 gains: g=(g(0)=g_(p),g(1), . . . , g(np));

the row vectors F_(p)(n) (0≦n<nc) are weighted contributions having, as components i (0≦i<lst), the products of convolution between the contribution n to the excitation sequence and the impulse response h of the weighted synthesis filter;

b designates the row vector composed of the nc scalar products between vector X and the row vectors F_(p)(n) ;

B designates a symmetric matrix with nc rows and nc columns, in which the term B_(i),j =F_(p)(i).F_(p)(j)^(T) (0≦i, j<nc) is equal to the scalar product between the previously defined vectors F_(p)(i) and F_(p)(j) ;

(.)^(T) designates the matrix transposition.

For the pulses of the stochastic excitation (1≦n≦np=nc-1) the vectorsF_(p)(n) consist simply of the vector of the impulse response h shiftedby p(n) samples. The fact of truncating the impulse response asdescribed above thus makes it possible substantially to reduce thenumber of operations of use in calculating the scalar products involvingthese vectors F_(p)(n). For the predictable contribution of theexcitation, the vector F_(p)(0) =Y_(TP) has as components F_(p)(0) (i)(0≦i<lst) the convolutions y_(TP) (i) which the module 38 calculatedaccording to formula (1) or (1') for the selected long-term predictiondelay TP. If MV=0, the contribution n=0 is also of pulse type and theposition p(0) has to be calculated.

Minimising the quadratic error E defined above amounts to finding the set of positions p(n) which maximise the normalised correlation b.B⁻¹.b^(T), then calculating the gains according to g=b.B⁻¹.

However, an exhaustive search for the pulse positions would require an excessive amount of computing. In order to reduce this problem, the multi-pulse approach generally applies a sub-optimal procedure consisting in successively calculating the gains and/or the pulse positions for each contribution. For each contribution n (0≦n<nc), first of all that position p(n) is determined which maximises the normalised correlation (F_(p).e_(n-1) ^(T))² /(F_(p).F_(p) ^(T)), the gains g_(n) (0) to g_(n) (n) are recalculated according to g_(n) =b_(n).B_(n) ⁻¹, where g_(n) =(g_(n) (0), . . . , g_(n) (n)), b_(n) =(b(0), . . . , b(n)) and B_(n) ={B_(i),j }₀≦i,j≦n, then, for the following iteration, the target vector e_(n) is calculated, equal to the initial target vector X from which are subtracted the contributions 0 to n of the weighted synthetic signal which are multiplied by their respective gains: ##EQU16##

On completion of the last iteration nc-1, the gains g_(nc-1) (i) are the selected gains and the minimised quadratic error E is equal to the energy of the target vector e_(nc-1).

The above method gives satisfactory results, but it requires a matrix B_(n) to be inverted at each iteration. In their article "Amplitude Optimisation and Pitch Prediction in Multipulse Coders" (IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 37, no. 3, March 1989, pages 317-327), S. Singhal and B. S. Atal proposed to simplify the problem of the inversion of the B_(n) matrices by using the Cholesky decomposition: B_(n) =M_(n).M_(n) ^(T) in which M_(n) is a lower triangular matrix. This decomposition is possible because B_(n) is a symmetric matrix with positive eigenvalues. The advantage of this approach is that the inversion of a triangular matrix is relatively straightforward, B_(n) ⁻¹ being obtainable by B_(n) ⁻¹ =(M_(n) ⁻¹)^(T).M_(n) ⁻¹.

However, the Cholesky decomposition and the inversion of the matrixM_(n) require divisions and square-root calculations to be performed,which are demanding operations in terms of calculating complexity. Theinvention proposes to simplify the implementation of the optimisationconsiderably by modifying the decomposition of the matrices B_(n) in thefollowing way:

    B_(n) =L_(n).R_(n) ^(T) =L_(n).(L_(n).K_(n) ⁻¹)^(T)

in which K_(n) is a diagonal matrix and L_(n) is a lower triangularmatrix having only 1's on its main diagonal (i.e. L_(n) =M_(n).K_(n)^(1/2) with the preceding notation). Having regard to the structure ofthe matrix B_(n), the matrices L_(n) =R_(n).K_(n), R_(n), K_(n) andL_(n) ⁻¹ are each constructed by simple addition of one row to thecorresponding matrices of the previous iteration: ##EQU17##

Under these conditions, the decomposition of B_(n), the inversion of L_(n), the obtaining of B_(n) ⁻¹ =(L_(n) ⁻¹)^(T).K_(n).L_(n) ⁻¹ and the recalculation of the gains require only a single division per iteration and no square-root calculation.

The stochastic analysis relating to a sub-frame of a voiced frame (MV=1, 2 or 3) may now proceed as indicated in FIGS. 8 to 11. To calculate the long-term prediction gain, the contribution index n is initialised to 0 at stage 180 and the vector F_(p)(0) is taken as equal to the long-term contribution Y_(TP) supplied by the module 38. If n>0, the iteration n commences with the determination 182 of the position p(n) of pulse n which maximises the quantity: ##EQU18## in which e=(e(0), . . . , e(lst-1)) is a target vector calculated during the preceding iteration. Various constraints can be applied to the domain of maximisation of the above quantity included in the interval [0, lst[. The invention preferably uses a segmental search in which the excitation sub-frame is subdivided into ns segments of the same length (for example, ns=10 for lst=40). For the first pulse (n=1), the maximisation of (F_(p).e^(T))² /(F_(p).F_(p) ^(T)) is performed over all the possible positions p in the sub-frame. At iteration n>1, the maximisation is performed at stage 182 on all the possible positions with the exclusion of the segments in which the positions p(1), . . . , p(n-1) of the pulses were respectively found during the previous iterations.
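The segmental search of stage 182 can be illustrated by the following minimal sketch. The function name `best_pulse_position` and the set `occupied_segments` are hypothetical, and it is assumed here that the weighted contribution F_(p) is the impulse response h shifted by p samples and cut at the end of the sub-frame.

```python
def best_pulse_position(h, e, occupied_segments, ns=10):
    """Position p maximising (F_p.e)^2 / (F_p.F_p) outside occupied segments.

    h : (possibly truncated) impulse response of the weighted synthesis filter
    e : target vector of lst samples from the preceding iteration
    occupied_segments : set of segment numbers already holding a pulse
    """
    lst = len(e)
    ls = lst // ns                      # segment length
    best_p, best_score = None, -1.0
    for p in range(lst):
        if p // ls in occupied_segments:
            continue                    # segment excluded from the search
        n = min(len(h), lst - p)        # samples of h that fit in the sub-frame
        corr = sum(h[i] * e[p + i] for i in range(n))
        energy = sum(h[i] * h[i] for i in range(n))
        if energy > 0.0:
            score = corr * corr / energy
            if score > best_score:
                best_p, best_score = p, score
    return best_p
```

With a one-sample response h=(1) the criterion reduces to e(p)², so the search returns the position of the largest target sample lying outside the occupied segments.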

In the case in which the current frame has been detected as unvoiced,the contribution n=0 also consists of a pulse with position p(0). Stage180 then comprises solely the initialisation n=0, and it is followed bya maximisation stage identical to stage 182 for finding p(0), with e=e₋₁=X as initial value of the target vector.

It will be noted that, when the contribution n=0 is predictable (MV=1, 2or 3), the closed-loop LTP analysis module 38 has performed an operationof a type similar to the maximisation 182, since it has determined thelong-term contribution, characterised by the delay TP, by maximising thequantity (Y_(T).e^(T))² /(Y_(T).Y_(T) ^(T)) in the delay T searchinterval, with e=e₋₁ =X as initial value of the target vector. It isalso possible, when the energy of the contribution LTP is very low, toignore this contribution in the process of recalculating the gains.

After stage 180 or 182, the module 40 carries out the calculation 184 ofthe row n of the matrices L, R and K involved in the decomposition ofthe matrix B, which makes it possible to complete the matrices L_(n),R_(n) and K_(n) defined above. The decomposition of the matrix B yields:##EQU19## for the component situated at row n and at column j. It canthen be said, for j increasing from 0 to n-1: ##EQU20## and, for j=n:##EQU21##

These relations are made use of in the calculation 184 detailed in FIG.9. The column index j is firstly initialised to 0, at stage 186. Forcolumn index j, the variable tmp is firstly initialised to the value ofthe component B(n,j), i.e.: ##EQU22##

At stage 188, the integer k is furthermore initialised to 0. A comparison 190 is then performed between the integers k and j. If k<j, the term L(n,k).R(j,k) is subtracted from the variable tmp, then the integer k is incremented by one unit (stage 192) before again performing the comparison 190. When the comparison 190 shows that k=j, a comparison 194 is performed between the integers j and n. If j<n, the component R(n,j) is taken as equal to tmp and the component L(n,j) to tmp.K(j) at stage 196, then the column index j is incremented by one unit before returning to stage 188 in order to calculate the following components. When the comparison 194 shows that j=n, the component K(n) of row n of the matrix K is calculated, which terminates the calculation 184 relating to row n. K(n) is taken as equal to 1/tmp if tmp≠0 (stage 198) and to 0 otherwise. It will be noted that the calculation 184 requires only one division 198 at most in order to obtain K(n). Moreover, any singularity of the matrix B_(n) does not entail instabilities since divisions by 0 are avoided.
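The calculation 184 of row n can be sketched as follows. The function name `decompose_row` and the storage of L and R as nested lists are illustrative assumptions. The partial sums are subtracted from tmp, as required by the identity B=L.R^(T) with L having 1's on its diagonal and L(n,j)=R(n,j).K(j).

```python
def decompose_row(B, L, R, K, n):
    """Fill row n of L and R and entry n of K; rows 0..n-1 must be done.

    B : symmetric matrix (nested lists), only row n, columns 0..n are read
    L, R : square nested lists updated in place
    K : list holding the diagonal of the matrix K, updated in place
    """
    for j in range(n + 1):
        tmp = B[n][j]                       # initialisation to B(n,j)
        for k in range(j):                  # loop of stages 190 and 192
            tmp -= L[n][k] * R[j][k]
        if j < n:                           # stage 196
            R[n][j] = tmp
            L[n][j] = tmp * K[j]
        else:                               # j == n, stage 198
            L[n][n] = 1.0
            R[n][n] = tmp                   # equals 1/K(n) when tmp != 0
            K[n] = 1.0 / tmp if tmp != 0.0 else 0.0
```

Calling the function for n=0, 1, 2, ... in turn builds the whole decomposition with one division per row; the product of L by the transpose of R then reproduces B.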

By reference to FIG. 8, the calculation 184 of the rows n of L, R and Kis followed by the inversion 200 of the matrix L_(n) consisting of therows and of the columns 0 to n of the matrix L. The fact that L istriangular with 1's on its principal diagonal greatly simplifies theinversion thereof as FIG. 10 shows. Indeed, it can be stated that:##EQU23## for 0≦j'<n and L⁻¹ (n,n)=1, that is to say that the inversioncan be done without having to perform a division. Moreover, as thecomponents of row n of L⁻¹ suffice for recalculating the gains, the useof the relation (5) makes it possible to carry out the inversion withouthaving to store the whole matrix L⁻¹, but only one vector Linv=(Linv(0),. . . , Linv(n-1)) with Linv(j')=L⁻¹ (n, j'). The inversion 200 thencommences with initialisation 202 of the column index j' to n-1. Atstage 204, the term Linv(j') is initialised to -L(n, j') and the integerk' to j'+1. Next a comparison 206 is performed between the integers k'and n. If k'<n, the term L(k',j').Linv(k') is subtracted from Linv(j'),then the integer k' is incremented by one unit (stage 208) before againperforming the comparison 206. When the comparison 206 shows that k'=n,j' is compared to 0 (test 210). If j'>0 the integer j' is decremented byone unit (stage 212) and stage 204 is re-entered for calculating thefollowing component. The inversion 200 is terminated when test 210 showsthat j'=0.
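The inversion 200 can be sketched in a few lines. The function name `inverse_row` is hypothetical; the loop follows stages 202 to 212, running the column index j' from n-1 down to 0 and using only multiplications and subtractions.

```python
def inverse_row(L, n):
    """Row n of the inverse of L, columns 0..n-1 (the vector Linv).

    L : lower triangular matrix with 1's on its main diagonal (nested lists)
    The diagonal term of the inverse, equal to 1, is not stored.
    """
    linv = [0.0] * n
    for j in range(n - 1, -1, -1):          # stages 202, 210 and 212
        acc = -L[n][j]                      # stage 204
        for k in range(j + 1, n):           # loop of stages 206 and 208
            acc -= L[k][j] * linv[k]
        linv[j] = acc
    return linv
```

For L with rows (1, 0, 0), (0.5, 1, 0) and (0.25, 0.125, 1), the call for n=2 yields Linv=(-0.1875, -0.125), and multiplying the row (-0.1875, -0.125, 1) by L gives (0, 0, 1) as expected.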

Referring to FIG. 8, the inversion 200 is followed by the calculation 214 of the re-optimised gains and of the target vector e for the following iteration. The calculation of the re-optimised gains is also very much simplified by the decomposition adopted for the matrix B. This is because it is possible to calculate the vector g_(n) =(g_(n) (0), . . . , g_(n) (n)), the solution of g_(n).B_(n) =b_(n) according to: ##EQU24## and g_(n) (i')=g_(n-1) (i')+L⁻¹ (n,i').g_(n) (n) for 0≦i'<n. The calculation 214 is detailed in FIG. 11. Firstly, the component b(n) of the vector b is calculated: ##EQU25## b(n) serves as initialisation value for the variable tmq. At stage 216, the index i is also initialised to 0. Next the comparison 218 is performed between the integers i and n. If i<n, the term b(i).Linv(i) is added to the variable tmq and i is incremented by one unit (stage 220) before returning to the comparison 218. When the comparison 218 shows that i=n, the gain relating to the contribution n is calculated according to g(n)=tmq.K(n), and the loop for calculating the other gains and the target vector is initialised (stage 222), taking e=X-g(n).F_(p)(n) and i'=0. This loop comprises a comparison 224 between the integers i' and n. If i'<n, the gain g(i') is recalculated at stage 226 by adding Linv(i').g(n) to its value calculated at the preceding iteration n-1, then the vector g(i').F_(p)(i') is subtracted from the target vector e. Stage 226 also comprises the incrementation of the index i' before returning to the comparison 224. The calculation 214 of the gains and of the target vector is terminated when the comparison 224 shows that i'=n. It can be seen that it has been possible to update the gains while calling on only row n of the inverse matrix L_(n) ⁻¹.
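The calculation 214 can be illustrated by the following minimal sketch. The function names `update_gains` and `target_vector` are hypothetical, and the scalar products b(i) are assumed to have been computed beforehand.

```python
def update_gains(g_prev, b, linv, k_n):
    """Gains g_n(0..n) from the gains g_(n-1)(0..n-1) of the previous iteration.

    g_prev : list of the n gains obtained at iteration n-1
    b      : scalar products b(0..n) between X and the contributions
    linv   : row n of the inverse of L, columns 0..n-1
    k_n    : component K(n) of the diagonal matrix K
    """
    n = len(g_prev)
    tmq = b[n]
    for i in range(n):                      # loop of stages 218 and 220
        tmq += b[i] * linv[i]
    g_n = tmq * k_n                         # gain of contribution n
    gains = [g_prev[i] + linv[i] * g_n for i in range(n)]   # stage 226
    gains.append(g_n)
    return gains

def target_vector(x, contributions, gains):
    """e = X minus the weighted contributions F_p(i) scaled by their gains."""
    e = list(x)
    for g, f in zip(gains, contributions):
        for i in range(len(e)):
            e[i] -= g * f[i]
    return e
```

As a check, for B with rows (4, 2) and (2, 3) the decomposition gives K(0)=0.25, L(1,0)=0.5, K(1)=0.5 and Linv=(-0.5). With b=(2, 5) the first iteration gives g=(0.5) and the second g=(-0.5, 2), which indeed satisfies g.B=b.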

The calculation 214 is followed by incrementation 228 of the index n of the contribution, then by a comparison 230 between the index n and the number of contributions nc. If n<nc, stage 182 is re-entered for the following iteration. The optimisation of the positions and of the gains is terminated when n=nc at test 230.

The segmental search for the pulses substantially reduces the number ofpulse positions to be evaluated in the course of the stochasticexcitation search stages 182. It moreover allows effectivequantification of the positions found. In the typical case in which thesub-frame of lst=40 samples is divided into ns=10 segments of ls=4samples, the set of possible pulse positions may take ns!.ls^(np)/[np!(ns-np)!]=258,048 values if np=5 (MV=1, 2 or 3) or 860,160 if np=6(MV=0), instead of lst!/[np!(lst-np)!]=658,008 values if np=5, or3,838,380 if np=6 in the case in which it is specified only that twopulses may not have the same position. In other words, the positions canbe quantified over 18 bits instead of 20 bits if np=5, and over 20 bitsinstead of 22 if np=6.
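The counts and bit budgets quoted above can be reproduced with the following short sketch. The function names are hypothetical; `math.comb` requires Python 3.8 or later.

```python
import math

def segmental_count(ns, ls, np_):
    """Number of position sets with at most one pulse per segment."""
    return math.comb(ns, np_) * ls ** np_

def unconstrained_count(lst, np_):
    """Number of position sets when only distinct positions are required."""
    return math.comb(lst, np_)

def bits_needed(count):
    """Smallest number of bits nb such that count <= 2**nb."""
    return (count - 1).bit_length()
```

For ns=10, ls=4 and np=5, `segmental_count` gives 258,048 sets (18 bits) against 658,008 (20 bits) for `unconstrained_count(40, 5)`.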

The particular case in which the number of segments per sub-frame isequal to the number of pulses per stochastic excitation (ns=np) leads tothe greatest simplicity in the search for the stochastic excitation, aswell as to the lowest binary data rate (if lst=40 and np=5, there are 8⁵=32768 sets of possible positions, quantifiable over only 15 bitsinstead of 18 if ns=10). However, by reducing the number of possibleinnovation sequences to this point, the quality of the coding may beimpoverished. For a given number of pulses, the number of segments maybe optimised according to a compromise envisaged between the quality ofthe coding and the simplicity of implementing it (as well as therequired data rate).

The case in which ns>np additionally exhibits the advantage that good robustness to transmission errors can be obtained, as far as the pulse positions are concerned, by virtue of a separate quantification of the order numbers of the occupied segments and of the relative positions of the pulses in each occupied segment. For a pulse n, the order number s_(n) of the segment and the relative position pr_(n) are respectively the quotient and the remainder of the Euclidean division of p(n) by the length ls of a segment: p(n)=s_(n).ls+pr_(n) (0≦s_(n) <ns, 0≦pr_(n) <ls). The relative positions are each quantified separately on 2 bits, if ls=4. In the event of a transmission error affecting one of these bits, the corresponding pulse will be only slightly displaced, and the perceptual impact of the error will be limited. The order numbers of the occupied segments are identified by a binary word of ns=10 bits each equal to 1 for the occupied segments and 0 for the segments in which the stochastic excitation has no pulse. The possible binary words are those having a Hamming weight of np; they number ns!/[np!(ns-np)!]=252 if np=5, or 210 if np=6. This word can be quantified by an index of nb bits with 2^(nb-1) <ns!/[np!(ns-np)!]≦2^(nb), i.e. nb=8 in the example in question. If, for example, the stochastic analysis has supplied np=5 pulses with positions 4, 12, 21, 34, 38, the relative positions, quantified as scalars, are 0, 0, 1, 2, 2 and the binary word representing the occupied segments is 0101010011, or 339 when translated into decimal.
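The separation into occupied-segment word and relative positions can be sketched as follows. The function name `encode_positions` is hypothetical, and the bit ordering (segment 0 in the most significant bit) is inferred from the worked example in the text rather than stated explicitly.

```python
def encode_positions(positions, ns=10, ls=4):
    """Split pulse positions into a segment-occupancy word and offsets.

    Returns (word, relative) where word has one bit per segment, segment 0
    being the most significant bit, and relative lists the remainders
    pr_n in order of increasing position.
    """
    word = 0
    relative = []
    for p in sorted(positions):
        s, pr = divmod(p, ls)               # p = s.ls + pr
        word |= 1 << (ns - 1 - s)           # mark segment s as occupied
        relative.append(pr)
    return word, relative
```

Applied to the positions 4, 12, 21, 34, 38, the function returns the word 339 (0101010011 in binary) and the relative positions 0, 0, 1, 2, 2.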

As for the decoder, the possible binary words are stored in a quantification table in which the read addresses are the received quantification indices. The order in this table, determined once and for all, may be optimised so that a transmission error affecting one bit of the index (the most frequent error case, particularly when interleaving is employed in the channel coder 22) has, on average, minimal consequences according to a proximity criterion. The proximity criterion is, for example, that a word of ns bits can be replaced only by "adjacent" words, separated by a Hamming distance equal at most to a threshold np-2δ, so as to preserve all the pulses except δ of them at valid positions in the event of an error in transmission of the index affecting a single bit. Other criteria could be used in substitution or in supplement, for example that two words are considered to be adjacent if the replacement of one by the other does not alter the order of assignment of the gains associated with the pulses.

By way of illustration, the simplified case can be considered where ns=4 and np=2, i.e. 6 possible binary words quantifiable over nb=3 bits. In this case, it can be verified that the quantification table presented in table II allows np-1=1 correctly positioned pulse to be kept for every error affecting one bit of the index transmitted. There are 4 error cases (out of a total of 18) for which a quantification index known to be erroneous is received (6 instead of 2 or 4; 7 instead of 3 or 5), but the decoder can then take measures limiting the distortion, for example it can repeat the innovation sequence relating to the preceding sub-frame, or even assign acceptable binary words to the "impossible" indices (for example, 1001 or 1010 for the index 6 and 1100 or 0110 for the index 7 lead again to np-1=1 correctly positioned pulse in the event of reception of 6 or 7 with a binary error).
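The verification mentioned above can be carried out exhaustively. The Python sketch below copies the six occupation words of table II into a list indexed by the quantification index, flips each of the nb=3 index bits in turn, and counts the pulses that remain in a correct segment. The function name single_bit_error_report is hypothetical.

```python
# Occupation words of table II, indexed by quantification index 0..5.
TABLE_II = [0b0011, 0b0101, 0b1001, 0b1100, 0b1010, 0b0110]
NB = 3  # bits of the quantification index


def single_bit_error_report(table, nb):
    """Enumerate every single-bit error on a transmitted index.

    Returns the number of correctly placed pulses for each error that
    lands on a valid index, and the (sent, received) pairs that land on
    an unused index.
    """
    kept = []
    invalid = []
    for sent, word in enumerate(table):
        for bit in range(nb):
            received = sent ^ (1 << bit)
            if received < len(table):
                common = word & table[received]   # segments still occupied
                kept.append(bin(common).count("1"))
            else:
                invalid.append((sent, received))
    return kept, invalid


kept, invalid = single_bit_error_report(TABLE_II, NB)
print(len(kept) + len(invalid), min(kept), invalid)
```

Of the 18 error cases, 14 lead to a valid index and each of them keeps one pulse in its correct segment; the 4 remaining cases are the receptions of 6 instead of 2 or 4 and of 7 instead of 3 or 5.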

In the general case, the order of the words in the quantification table can be determined on the basis of arithmetic considerations or, if that is insufficient, by simulating the error scenarios on the computer (exhaustively or by a statistical sampling of the Monte Carlo type, depending on the number of possible error cases).

In order to make transmission of the occupied segment quantification index more secure, advantage can be taken, furthermore, of the various categories of protection offered by the channel coder 22, particularly if the proximity criterion cannot be met satisfactorily for all the possible error cases affecting one bit of the index. The ordering module 46 can thus place in the minimum protection category, or the unprotected category, a certain number nx of bits of the index which, if they are affected by a transmission error, give rise to a word which is erroneous but which satisfies the proximity criterion with a probability deemed to be satisfactory, and place the other bits of the index in a better protected category. This approach involves another ordering of the words in the quantification table. This ordering can also be optimised by means of simulations if it is desired to maximise the number nx of bits of the index assigned to the least protected category.

                          TABLE II
______________________________________________________________
quantification index           segment occupation word
decimal    natural binary      natural binary       decimal
______________________________________________________________
0          000                 0011                 3
1          001                 0101                 5
2          010                 1001                 9
3          011                 1100                 12
4          100                 1010                 10
5          101                 0110                 6
(6)        (110)               (1001 or 1010)       (9 or 10)
(7)        (111)               (1100 or 0110)       (12 or 6)
______________________________________________________________

One possibility is to start by compiling a list of words of ns bits by counting in Gray code from 0 to 2^(ns)-1, and to obtain the ordered quantification table by deleting from that list the words not having a Hamming weight of np. The table thus obtained is such that two consecutive words have a Hamming distance of np-2. If the indices in this table have a binary representation in Gray code, any error in the least-significant bit causes the index to vary by ±1 and thus entails the replacement of the actual occupation word by a word which is adjacent in the meaning of the threshold np-2 over the Hamming distance, and an error in the i-th least-significant bit also causes the index to vary by ±1 with a probability of about 2^(1-i). By placing the nx least-significant bits of the index in Gray code in an unprotected category, any transmission error affecting one of these bits leads to the occupation word being replaced by an adjacent word with a probability at least equal to (1+1/2+. . .+1/2^(nx-1))/nx. This minimal probability decreases from 1 to (2/nb)(1-1/2^(nb)) for nx increasing from 1 to nb. The errors affecting the nb-nx most significant bits of the index will most often be corrected by virtue of the protection which the channel coder applies to them. The value of nx in this case is chosen as a compromise between robustness to errors (small values) and restricted size of the protected categories (large values).
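The construction just described can be sketched in Python as follows. The sketch assumes the usual reflected binary Gray code, in which the n-th word is n XOR (n>>1), and takes the Hamming distance in its usual sense of the number of differing bit positions. The function names are hypothetical and are not part of the described coder.

```python
def gray_ordered_table(ns, np_):
    """Words of ns bits and Hamming weight np_, in Gray-code counting order."""
    table = []
    for n in range(1 << ns):
        g = n ^ (n >> 1)                 # n-th word of the reflected Gray code
        if bin(g).count("1") == np_:
            table.append(g)
    return table


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def min_adjacent_probability(nx):
    """Lower bound (1 + 1/2 + ... + 1/2^(nx-1)) / nx quoted in the text."""
    return sum(0.5 ** i for i in range(nx)) / nx


table = gray_ordered_table(10, 5)
distances = {hamming(a, b) for a, b in zip(table, table[1:])}
print(len(table), distances)
print(min_adjacent_probability(1), min_adjacent_probability(8))
```

For ns=10 and np=5 the sketch produces 252 words, and each pair of consecutive words differs in exactly two bit positions, which means that a single pulse changes segment while the np-1 others keep theirs. The bound evaluates to 1 for nx=1 and to (2/8)(1-1/2^8) for nx=nb=8.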

As for the coder, the binary words which are possible for representing the occupation of the segments are held in increasing order in a lookup table. An indexing table gives, at each address, the order number in the quantification table stored at the decoder of the binary word held at the same address in the lookup table. In the simplified example set out above, the contents of the lookup table and of the indexing table are given in table III (in decimal values).

The quantification of the segment occupation word deduced from the np positions supplied by the stochastic analysis module 40 is performed in two stages by the quantification module 44. A binary search is performed first of all in the lookup table in order to determine the address in this table of the word to be quantified. The quantification index is then read at that address in the indexing table and supplied to the bit ordering module 46.

                  TABLE III
______________________________________________
Address      Lookup table      Indexing table
______________________________________________
0            3                 0
1            5                 1
2            6                 5
3            9                 2
4            10                4
5            12                3
______________________________________________
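The two-stage quantification performed at the coder can be sketched in Python with the contents of table III. The sketch uses the standard bisect module for the binary search; the names LOOKUP, INDEXING, DECODER_TABLE and quantify_occupation_word are hypothetical, and DECODER_TABLE simply repeats the six valid entries of table II for checking purposes.

```python
from bisect import bisect_left

LOOKUP = [3, 5, 6, 9, 10, 12]        # possible occupation words, increasing
INDEXING = [0, 1, 5, 2, 4, 3]        # quantification index of each word
DECODER_TABLE = [3, 5, 9, 12, 10, 6]  # table II: word read at each index


def quantify_occupation_word(word):
    """Return the quantification index of an occupation word."""
    address = bisect_left(LOOKUP, word)           # stage 1: binary search
    if address == len(LOOKUP) or LOOKUP[address] != word:
        raise ValueError("word does not have the expected Hamming weight")
    return INDEXING[address]                      # stage 2: indexing table


for word in LOOKUP:
    index = quantify_occupation_word(word)
    print(format(word, "04b"), index, DECODER_TABLE[index] == word)
```

Each word of the lookup table is mapped to the index at which the decoder table of table II holds the same word, so that coding followed by decoding returns the original occupation word.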

The module 44 furthermore performs the quantification of the gains calculated by the module 40. The gain g_(TP) is quantified, for example, in the interval [0, 1.6], over 5 bits if MV=1 or 2 and over 6 bits if MV=3, in order to take account of the higher perceptual importance of this parameter for the very voiced frames. For coding of the gains associated with the pulses of the stochastic excitation, the largest absolute value Gs of the gains g(1), . . . , g(np) is quantified over five bits, taking, for example, 32 values of quantification in geometric progression in the interval [0, 32767], and each of the relative gains g(1)/Gs, . . . , g(np)/Gs is quantified in the interval [-1, +1], over 4 bits if MV=1, 2 or 3, or over five bits if MV=0.

The quantification bits of Gs are placed in a protected category by the channel coder 22, as are the most significant bits of the quantification indices of the relative gains. The quantification bits of the relative gains are ordered in such a way as to allow them to be assigned to the associated pulses belonging to the segments located by the occupation word. The segmental search according to the invention further makes it possible to protect effectively the relative positions of the pulses associated with the highest values of gain.

In the case where np=5 and ls=4, ten bits per sub-frame are necessary to quantify the relative positions of the pulses in the segments. The case is considered in which 5 of these 10 bits are placed in a partly protected or unprotected category (II), and in which the other 5 are placed in a more highly protected category (IB). The most natural distribution is to place the most significant bit of each relative position in the protected category IB, so that any transmission errors tend to affect the least significant bits and therefore cause only a shift of one sample for the corresponding pulse. It is advisable, however, for the quantification of the relative positions, to consider the pulses in decreasing order of absolute values of the associated gains, and to place in category IB the two quantification bits of each of the first two relative positions as well as the most significant bit of the third one. In this way, the positions of the pulses are protected preferentially when they are associated with high gains, which enhances average quality, particularly for the most voiced sub-frames.
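The second distribution described above can be sketched in Python as follows, for np=5 and ls=4 (2 bits per relative position). The sketch assumes that the gains used for ranking are available identically at the coder and the decoder and that ties keep the order of the segments; neither point is specified in the text. The function name and the example gain values are hypothetical.

```python
def order_relative_position_bits(gains, relative_positions):
    """Distribute the 2-bit relative positions between categories IB and II.

    Pulses are ranked by decreasing absolute gain. Category IB receives
    both bits of the first two ranked pulses and the most significant
    bit of the third; all the other bits go to category II.
    """
    order = sorted(range(len(gains)), key=lambda n: abs(gains[n]), reverse=True)
    category_ib, category_ii = [], []
    for rank, n in enumerate(order):
        msb = (relative_positions[n] >> 1) & 1
        lsb = relative_positions[n] & 1
        if rank < 2:
            category_ib += [msb, lsb]
        elif rank == 2:
            category_ib.append(msb)
            category_ii.append(lsb)
        else:
            category_ii += [msb, lsb]
    return category_ib, category_ii


# Illustrative gains for the pulses located in segments 1, 3, 5, 8, 9.
gains = [0.2, -0.9, 0.5, 0.1, -0.7]
ib, ii = order_relative_position_bits(gains, [0, 0, 1, 2, 2])
print(ib, ii)
```

With these illustrative values, each category receives 5 of the 10 bits, and the two pulses of largest gain have both bits of their relative position in category IB.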

In order to reconstitute the pulse contributions of the excitation, the decoder 54 firstly locates the segments by means of the received occupation word; it then assigns the associated gains; then it assigns the relative positions to the pulses on the basis of the order of size of the gains.
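The three decoding steps can be sketched in Python as follows. The sketch assumes that segment 0 is the leftmost bit of the occupation word, that the gains are received in the order of the occupied segments, and that the relative positions are received in decreasing order of absolute gain; these conventions, the function name decode_pulses and the example gain values are assumptions made for illustration.

```python
def decode_pulses(word, ns, ls, gains, relative_in_gain_order):
    """Rebuild pulse positions and the pulse excitation of one sub-frame."""
    # Step 1: locate the occupied segments (bit ns-1 is segment 0).
    segments = [s for s in range(ns) if (word >> (ns - 1 - s)) & 1]
    if len(segments) != len(gains):
        raise ValueError("occupation word and gains disagree on pulse count")
    # Step 2: gains[n] is assigned to the n-th occupied segment.
    # Step 3: rank the pulses by decreasing absolute gain and hand out
    # the relative positions in that order.
    order = sorted(range(len(gains)), key=lambda n: abs(gains[n]), reverse=True)
    positions = [0] * len(gains)
    excitation = [0.0] * (ns * ls)
    for rank, n in enumerate(order):
        positions[n] = segments[n] * ls + relative_in_gain_order[rank]
        excitation[positions[n]] = gains[n]
    return positions, excitation


positions, excitation = decode_pulses(
    339, ns=10, ls=4,
    gains=[0.2, -0.9, 0.5, 0.1, -0.7],
    relative_in_gain_order=[0, 2, 1, 0, 2],
)
print(positions)
```

With the occupation word 339 and these illustrative gains, the sketch restores the positions 4, 12, 21, 34, 38 of the example given earlier.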

It will be understood that the various aspects of the invention described above each yield specific improvements, and that it is therefore possible to envisage implementing them independently of one another. Combining them makes it possible to produce a coder of particularly beneficial performance.

In the illustrative embodiment described in the foregoing, the 13 kbits/s speech coder requires of the order of 15 million instructions per second (Mips) in fixed point mode. It will therefore typically be produced by programming a commercially available digital signal processor (DSP), and likewise for the decoder, which requires only of the order of 5 Mips.

What is claimed is:
 1. Analysis-by-synthesis speech coding method for a speech signal digitised into successive frames each divided into a number nst of sub-frames, each sub-frame having a number lst of samples, comprising the steps of: linear prediction analysis of the speech signal in order to determine parameters of a short-term synthesis filter; open-loop analysis of the speech signal in order to detect voiced frames of the signal and in order, for each voiced frame, to determine a degree of voicing of the signal and an interval for searching for a long-term prediction delay; closed-loop predictive analysis of the speech signal in order, for at least one of the sub-frames of the voiced frames, to select a long-term prediction delay contained in the search interval and constituting a parameter of a long-term synthesis filter; and determination of a stochastic excitation for each sub-frame, so as to minimise a perceptually weighted difference between the speech signal and the stochastic excitation filtered by the long-term and short-term synthesis filters, wherein, in the open-loop analysis step, the search interval relating to each voiced frame is so determined as to contain a number of delays which is dependent on the degree of voicing of said frame.
 2. Method according to claim 1, wherein the interval for searching for the long-term prediction delay contains fewer delays for those frames having the greatest degree of voicing than for the other voiced frames.
 3. Method according to claim 1, wherein the open-loop analysis relating to a frame comprises a determination of nst basic delays which each maximise an open-loop estimate of a long-term prediction gain over a respective sub-frame of said frame, then a comparison between a first predetermined threshold and a first open-loop estimate of the long-term prediction gain over the frame obtained on the basis of the nst basic delays relating to the corresponding sub-frames in order to detect whether the frame is voiced, wherein, if the frame is detected as voiced, the open-loop analysis further comprises, for each sub-frame, a determination of a list of candidate delays for which the open-loop estimate of the prediction gain over the sub-frame is higher than a defined fraction of the estimate relating to the basic delay for the sub-frame, wherein the candidate delay for which a second open-loop estimate of the long-term prediction gain over the frame is a maximum is selected from said lists, the second open-loop estimate over the frame associated with a candidate delay being obtained on the basis of nst optimal delays, lying in an interval of N1 delays which is centred on said candidate delay, which, respectively, over said interval, maximise the open-loop estimate of the prediction gain over the nst sub-frames, wherein the determination of the degree of voicing of the frame comprises a comparison between the second maximised estimate of the prediction gain over the frame and at least one other predetermined threshold, and wherein the search interval determined on completion of the open-loop analysis is centred on said selected delay.
 4. Method according to claim 1, wherein the open-loop analysis relating to a frame comprises a determination of a basic delay which maximises a first open-loop estimate of a long-term prediction gain over said frame, then a comparison between a first predetermined threshold and the first maximised estimate of the long-term prediction gain over the frame in order to detect whether the frame is voiced, wherein, if the frame is detected as voiced, the open-loop analysis further comprises a determination of a list of candidate delays for which the open-loop estimate of the prediction gain over the frame is higher than a defined fraction of the estimate relating to the basic delay, wherein the candidate delay for which a second open-loop estimate of the long-term prediction gain over the frame is a maximum is selected from said list, the second open-loop estimate over the frame associated with a candidate delay being obtained on the basis of nst optimal delays, lying in an interval of N1 delays which is centred on said candidate delay, which, respectively, over said interval, maximise the open-loop estimate of the prediction gain over the nst sub-frames, wherein the determining of the degree of voicing of the frame comprises a comparison between the second maximised estimate of the prediction gain over the frame and at least one other predetermined threshold, and wherein the search interval determined on completion of the open-loop analysis is centred on said selected delay.
 5. Method according to claim 1, wherein the open-loop analysis relating to a frame comprises a determination of a number nz of basic delays which each, over a respective sub-interval of possible delay values, maximise a first open-loop estimate of a long-term prediction gain over said frame, then a comparison between a first predetermined threshold and the largest of the first nz maximised estimates of the long-term prediction gain over the frame in order to detect whether the frame is voiced, wherein, if the frame is detected as voiced, the candidate delay for which a second open-loop estimate of the long-term prediction gain over the frame is a maximum is selected from among nz candidate delays obtained from the nz basic delays, the second open-loop estimate over the frame associated with a candidate delay being obtained on the basis of nst optimal delays, lying in an interval of N1 delays which is centred on said candidate delay, which, respectively, over said interval, maximise the open-loop estimate of the prediction gain over the nst sub-frames, wherein the determining of the degree of voicing of the frame comprises a comparison between the second maximised estimate of the prediction gain over the frame and at least one other predetermined threshold, and wherein the search interval determined on completion of the open-loop analysis is centred on said selected delay.
 6. Method according to claim 3, wherein, if the second maximised estimate of the prediction gain over a voiced frame is higher than one of the thresholds, it is determined whether the nst optimal delays lie within an interval centred on the selected delay and containing a number N3 of delays which is less than N1 and, if so, the frame is assigned a degree of voicing for which the interval for searching for the long-term prediction delay contains N3 delays, the search interval containing N1 delays for at least one other degree of voicing.
 7. Method according to claim 3, wherein, during the maximising of the second open-loop estimate of the long-term prediction gain over a voiced frame, a third open-loop estimate of the gain over the frame is also calculated on the basis of nst delays, lying within an interval centred on the selected delay and containing a number N3 of delays which is less than N1, which, respectively, over said interval of N3 delays, maximise the open-loop estimate of the prediction gain over the nst sub-frames, and wherein the frame is assigned a degree of voicing for which the search interval contains N3 delays if said third estimate exceeds a predetermined threshold, the search interval containing N1 delays for at least one other degree of voicing.
 8. Method according to claim 3, wherein the candidate delays of a list are chosen from among the sub-multiples of the basic delay which is associated with said list and from among the multiples of the smallest of said sub-multiples for which the open-loop estimate of the prediction gain is higher than said defined fraction of the estimate relating to the basic delay.
 9. Method according to claim 8, wherein the long-term prediction delays correspond to integer or fractional numbers of samples of the speech signal, wherein the basic delays are determined in fractional resolution in order to search for the sub-multiples and the multiples to be included in a list of candidate delays, and wherein the basic delays are determined in integer resolution in order to evaluate the first open-loop estimates of the prediction gain over a frame.
 10. Method according to claim 3, wherein the closed-loop predictive analysis is not carried out in relation to each sub-frame for which the autocorrelation of the speech signal associated with the optimal delay for said sub-frame is negative.
 11. Method according to claim 4, wherein, if the second maximised estimate of the prediction gain over a voiced frame is higher than one of the thresholds, it is determined whether the nst optimal delays lie within an interval centred on the selected delay and containing a number N3 of delays which is less than N1 and, if so, the frame is assigned a degree of voicing for which the interval for searching for the long-term prediction delay contains N3 delays, the search interval containing N1 delays for at least one other degree of voicing.
 12. Method according to claim 4, wherein, during the maximising of the second open-loop estimate of the long-term prediction gain over a voiced frame, a third open-loop estimate of the gain over the frame is also calculated on the basis of nst delays, lying within an interval centred on the selected delay and containing a number N3 of delays which is less than N1, which, respectively, over said interval of N3 delays, maximise the open-loop estimate of the prediction gain over the nst sub-frames, and wherein the frame is assigned a degree of voicing for which the search interval contains N3 delays if said third estimate exceeds a predetermined threshold, the search interval containing N1 delays for at least one other degree of voicing.
 13. Method according to claim 4, wherein the candidate delays of a list are chosen from among the sub-multiples of the basic delay which is associated with said list and from among the multiples of the smallest of said sub-multiples for which the open-loop estimate of the prediction gain is higher than said defined fraction of the estimate relating to the basic delay.
 14. Method according to claim 13, wherein the long-term prediction delays correspond to integer or fractional numbers of samples of the speech signal, wherein the basic delays are determined in fractional resolution in order to search for the sub-multiples and the multiples to be included in a list of candidate delays, and wherein the basic delays are determined in integer resolution in order to evaluate the first open-loop estimates of the prediction gain over a frame.
 15. Method according to claim 4, wherein the closed-loop predictive analysis is not carried out in relation to each sub-frame for which the autocorrelation of the speech signal associated with the optimal delay for said sub-frame is negative.
 16. Method according to claim 5, wherein, if the second maximised estimate of the prediction gain over a voiced frame is higher than one of the thresholds, it is determined whether the nst optimal delays lie within an interval centred on the selected delay and containing a number N3 of delays which is less than N1 and, if so, the frame is assigned a degree of voicing for which the interval for searching for the long-term prediction delay contains N3 delays, the search interval containing N1 delays for at least one other degree of voicing.
 17. Method according to claim 5, wherein, during the maximising of the second open-loop estimate of the long-term prediction gain over a voiced frame, a third open-loop estimate of the gain over the frame is also calculated on the basis of nst delays, lying within an interval centred on the selected delay and containing a number N3 of delays which is less than N1, which, respectively, over said interval of N3 delays, maximise the open-loop estimate of the prediction gain over the nst sub-frames, and wherein the frame is assigned a degree of voicing for which the search interval contains N3 delays if said third estimate exceeds a predetermined threshold, the search interval containing N1 delays for at least one other degree of voicing.
 18. Method according to claim 5, wherein the closed-loop predictive analysis is not carried out in relation to each sub-frame for which the autocorrelation of the speech signal associated with the optimal delay for said sub-frame is negative.